PySpark isLocal() Function Tutorial
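The isLocal() method of a PySpark DataFrame returns True if collect() and take() can be run locally, without any Spark executors. It describes how that particular DataFrame would be evaluated, not whether the Spark application as a whole runs in local mode or on a distributed cluster, so two DataFrames in the same session can return different values. It can still be helpful when debugging how a DataFrame will be executed.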
Step 1: Import SparkSession
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("isLocal Example").getOrCreate()
Step 2: Create Sample Data
data = [
    ("Aamir Shahzad", "Pakistan", 30),
    ("Ali Raza", "USA", 28),
    ("Bob", "UK", 45),
    ("Lisa", "Canada", 33)
]
columns = ["Name", "Country", "Age"]
df = spark.createDataFrame(data, schema=columns)
Step 3: Show the DataFrame
df.show()
Output:
+-------------+--------+---+
|         Name| Country|Age|
+-------------+--------+---+
|Aamir Shahzad|Pakistan| 30|
|     Ali Raza|     USA| 28|
|          Bob|      UK| 45|
|         Lisa|  Canada| 33|
+-------------+--------+---+
Step 4: Call isLocal() on the DataFrame
try:
    print("Is this DataFrame local? =", df.isLocal())
except Exception as e:
    print("⚠️ df.isLocal() is not supported in this environment:", str(e))
Output (example; the value depends on how the DataFrame was built and on your Spark environment):
Is this DataFrame local? = False
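Because isLocal() reports on a single DataFrame, it does not by itself tell you whether the application is running on a local machine or a cluster. If that is what you want to know, one option is to look at the master URL of the session. The sketch below is only an illustration: is_local_master is a hypothetical helper name (it is not part of PySpark), the host in the sample URL is made up, and it assumes a classic (non-Connect) session where spark.sparkContext.master is available.

```python
def is_local_master(master: str) -> bool:
    """Return True when a Spark master URL denotes local mode.

    Hypothetical helper, not part of PySpark. Local master URLs take the
    forms "local", "local[N]", "local[*]" and "local[N,F]". Anything else
    (for example "yarn" or "spark://...") is treated as non-local.
    """
    master = master.strip()
    return master == "local" or master.startswith("local[")


# Typical call, assuming a classic session named `spark` already exists:
#     is_local_master(spark.sparkContext.master)

if __name__ == "__main__":
    # Sample URLs for illustration only; "host" is a placeholder.
    for url in ("local[*]", "local", "yarn", "spark://host:7077"):
        print(url, "->", is_local_master(url))
```

Keeping the check as a plain string function means it can be exercised without starting Spark. The master URL is only read when you pass spark.sparkContext.master to it.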


