How to Add Columns to DataFrame and Check Schema in PySpark
In this tutorial, we’ll cover how to add columns to a DataFrame and how to check its schema using PySpark.
1. Creating a DataFrame
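from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession; the app name here is only illustrative.
spark = SparkSession.builder.appName("add-columns-example").getOrCreate()

data = [
    (1, "Alice", 25),
    (2, "Bob", 30),
    (3, "Charlie", 35),
    (4, "David", 40)
]
df = spark.createDataFrame(data, ["id", "name", "age"])
df.show()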
2. Adding New Columns
We can add new columns using the withColumn() method, which returns a new DataFrame and leaves the original unchanged.
from pyspark.sql.functions import lit
df_new = df.withColumn("country", lit("USA"))
df_new.show()
3. Adding Columns Using Expressions
from pyspark.sql.functions import col
df_exp = df.withColumn("age_double", col("age") * 2)
df_exp.show()
4. Adding Multiple Columns
df_multi = df \
    .withColumn("country", lit("USA")) \
    .withColumn("age_plus_ten", col("age") + 10)
df_multi.show()
5. Checking the Schema of DataFrame
df.printSchema()
This command prints the schema of the DataFrame as a tree, showing each column's name, data type, and whether it can contain nulls.
Conclusion
Adding columns in PySpark is simple and flexible. The withColumn() method is the most common way to add or modify columns, and the printSchema() method provides a quick view of the DataFrame’s structure.