What is a DataFrame in PySpark?
A DataFrame in PySpark is a distributed collection of data organized into named columns. It is conceptually similar to a table in a relational database or an Excel spreadsheet, but its rows are partitioned across the machines of a cluster, so large datasets can be processed in parallel.
Key Features
- Structured Data: Organizes data into rows and columns.
- Fast and Scalable: Partitions data across a cluster and evaluates transformations lazily, so Spark can optimize a query before running it.
- Data Source Flexibility: Works with CSV, JSON, Parquet, databases, etc.
- SQL Queries: Can be registered as a temporary view and queried with SQL, or filtered and grouped through the DataFrame API.
Example: Creating a DataFrame
from pyspark.sql import SparkSession
from pyspark.sql import Row
# Create a SparkSession
spark = SparkSession.builder.appName("DataFrameExample").getOrCreate()
# Create data as a list of Row objects
data = [
    Row(id=1, name="Alice", age=25),
    Row(id=2, name="Bob", age=30),
    Row(id=3, name="Charlie", age=35),
]
# Create DataFrame
df = spark.createDataFrame(data)
# Show DataFrame content
df.show()
Output
+---+-------+---+
| id|   name|age|
+---+-------+---+
|  1|  Alice| 25|
|  2|    Bob| 30|
|  3|Charlie| 35|
+---+-------+---+
Conclusion
PySpark DataFrames are an essential tool for working with structured and semi-structured data in big data processing. They provide an easy-to-use API for data manipulation and analysis.