PySpark isLocal() Function Tutorial | Check If DataFrame Operations Run Locally

The isLocal() function in PySpark checks whether a DataFrame's collect() and take() operations can run entirely on the driver, without any Spark executors. In other words, it describes the DataFrame's query plan rather than the cluster itself, which makes it useful for debugging or for validating whether an operation will actually be distributed.
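
As a quick illustration of what the flag means, here is a minimal, self-contained sketch; the appName and the spark.range() DataFrame are illustrative assumptions, not part of this tutorial's steps:


from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("isLocal sketch").getOrCreate()

# range() builds a distributed query plan, so even on a single machine
# collect() is routed through Spark's execution machinery
df = spark.range(5)
print(df.isLocal())  # typically False for a plan like this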

Step 1: Import SparkSession


from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession: the entry point for DataFrame operations
spark = SparkSession.builder.appName("isLocal Example").getOrCreate()
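
If you want to pin where the session runs while experimenting, you can set the master URL explicitly; the local[*] setting below is an assumption for local testing, not something the original setup requires:


# Optional variation: force an in-process local session for testing
spark = (
    SparkSession.builder
    .master("local[*]")           # use all local cores, no external cluster
    .appName("isLocal Example")
    .getOrCreate()
)

print(spark.sparkContext.master)  # the session's master URL, e.g. local[*]

Note that isLocal() can still return False on a local[*] master: the master URL describes the session, while isLocal() describes the DataFrame's plan.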

Step 2: Create Sample Data


# Sample rows plus column names for a small demo DataFrame
data = [
    ("Aamir Shahzad", "Pakistan", 30),
    ("Ali Raza", "USA", 28),
    ("Bob", "UK", 45),
    ("Lisa", "Canada", 33)
]
columns = ["Name", "Country", "Age"]
df = spark.createDataFrame(data, schema=columns)

Step 3: Show the DataFrame


df.show()

Output:


+-------------+--------+---+
|         Name| Country|Age|
+-------------+--------+---+
|Aamir Shahzad|Pakistan| 30|
|     Ali Raza|     USA| 28|
|          Bob|      UK| 45|
|         Lisa|  Canada| 33|
+-------------+--------+---+

Step 4: Use isLocal() to Check Execution Mode


try:
    # True only when collect()/take() can run on the driver without executors
    print("Is this DataFrame local? =", df.isLocal())
except Exception as e:
    print("⚠️ df.isLocal() is not supported in this environment:", str(e))

Output (example):


Is this DataFrame local? = False
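
A False result here does not necessarily mean the code is running on a remote cluster. createDataFrame() typically yields a plan that still executes through Spark's executors, so collect() is not a purely driver-side operation. To peek at the plan behind the answer (output varies by Spark version), you can print it:


# Inspect the logical and physical plans behind the DataFrame
df.explain(extended=True)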

📺 Watch the Full Tutorial on YouTube
