What is a DataFrame in PySpark? | How to create DataFrame from Static Values | PySpark Tutorial

What is a DataFrame in PySpark?

A DataFrame in PySpark is a distributed collection of data organized into named columns. It is similar to a table in a relational database or a sheet in an Excel workbook, except that its rows are split into partitions that can live on different machines. This lets you process large amounts of data efficiently, because the work is spread across the machines of a cluster and runs in parallel.

Key Features

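  • Structured Data: Organizes data into rows and named, typed columns.
  • Fast and Scalable: Splits data into partitions that are processed in parallel, so it can handle datasets too large for a single machine.
  • Data Source Flexibility: Reads from and writes to CSV, JSON, Parquet, JDBC databases, and other sources.
  • SQL Queries: Supports SQL queries as well as equivalent DataFrame methods for filtering and grouping data.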

Example: Creating a DataFrame

from pyspark.sql import SparkSession
from pyspark.sql import Row

# Create a SparkSession
spark = SparkSession.builder.appName("DataFrameExample").getOrCreate()

# Create data as a list of Row objects
data = [
    Row(id=1, name="Alice", age=25),
    Row(id=2, name="Bob", age=30),
    Row(id=3, name="Charlie", age=35)
]

# Create DataFrame
df = spark.createDataFrame(data)

# Show DataFrame content
df.show()

Output

+---+-------+---+
| id|   name|age|
+---+-------+---+
|  1|  Alice| 25|
|  2|    Bob| 30|
|  3|Charlie| 35|
+---+-------+---+

Conclusion

PySpark DataFrames are an essential tool for working with structured and semi-structured data in big data processing. They provide an easy-to-use API for data manipulation and analysis.
