PySpark Date and Timestamp Types Explained
In this post, we look at how PySpark handles DateType, TimestampType, and the interval types, using short examples you can run yourself. This is part of our complete PySpark tutorial series.
🔹 Step 1: Import & Session Setup
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("DatetimeTypesDemo").getOrCreate()
🔹 Step 2: Create Sample Data
from pyspark.sql.functions import current_date, current_timestamp
df = spark.range(1).select(
    current_date().alias("current_date"),
    current_timestamp().alias("current_timestamp")
)
df.show(truncate=False)
Output:
+------------+-----------------------+
|current_date|current_timestamp      |
+------------+-----------------------+
|2025-04-08  |2025-04-08 14:12:34.123|
+------------+-----------------------+
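When rows are collected to the driver, PySpark hands back DateType values as Python `datetime.date` objects and TimestampType values as `datetime.datetime` objects. The plain-Python sketch below (no Spark needed) uses a fixed, hypothetical timestamp that mirrors the sample output above to show how the two relate: the date is simply the timestamp with its time of day dropped. The variable names are illustrative only.

```python
from datetime import date, datetime

# Hypothetical fixed value mirroring the sample output (not a live clock reading).
sample_timestamp = datetime(2025, 4, 8, 14, 12, 34, 123000)

# A date carries only year, month, and day; the time of day is dropped.
sample_date = sample_timestamp.date()

# Format the timestamp with millisecond precision, as in the table above.
# %f yields six digits (microseconds), so the last three are trimmed.
formatted_timestamp = sample_timestamp.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]

print(sample_date.isoformat())
print(formatted_timestamp)
```

In a real job the values come from the cluster clock and the session time zone, so your own output will differ from the sample.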
🔹 Step 3: Interval Types
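# The interval literal is plain SQL, so spark.sql() is enough and no extra import is needed.
df_interval = spark.sql("SELECT INTERVAL '3 12:15:32' DAY TO SECOND AS day_to_sec")
# truncate=False keeps longer interval strings from being cut off at 20 characters.
# Note: how the interval value is rendered as text can vary between Spark versions.
df_interval.show(truncate=False)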
Output:
+-----------------+
|day_to_sec       |
+-----------------+
|3 12:15:32.000000|
+-----------------+
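A day-to-second interval is a fixed duration, so it behaves much like Python's `datetime.timedelta` (which, as far as we know, is what recent PySpark versions return for a DayTimeIntervalType column). As a rough plain-Python sketch, the hypothetical helper `parse_day_to_second` below turns the literal's `D HH:MM:SS` text into a `timedelta` and adds it to a timestamp. It only illustrates the arithmetic; it is not Spark's own parser, and it assumes non-negative, whole-second values.

```python
from datetime import datetime, timedelta


def parse_day_to_second(text: str) -> timedelta:
    """Parse 'D HH:MM:SS' into a timedelta (hypothetical helper, not a Spark API)."""
    days_part, time_part = text.split(" ")
    hours, minutes, seconds = (int(part) for part in time_part.split(":"))
    return timedelta(days=int(days_part), hours=hours, minutes=minutes, seconds=seconds)


# Same literal as in the SQL example: 3 days, 12 hours, 15 minutes, 32 seconds.
interval = parse_day_to_second("3 12:15:32")

# Hypothetical starting timestamp, chosen to match the earlier sample output.
start = datetime(2025, 4, 8, 14, 12, 34)
shifted = start + interval

print(interval.total_seconds())
print(shifted)
```

Thinking of the interval as a duration makes it easier to predict what happens when you add it to a timestamp column: the days and the time-of-day parts are applied together as one offset.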