PySpark action df.show() returns Java error

I am setting up a Spark development environment through Anaconda on my Windows 10 desktop. The same setup used to work fine on this machine, but after some cleanup and a fresh reinstall I now get an error when I ask Spark to show the data. Importing the libraries, initializing the session, and loading data into a DataFrame all work fine until I call the action show(). I suspect it has something to do with my environment settings. What am I doing wrong?

Environment:

spark-3.1.2-bin-hadoop2.7 (SPARK_HOME & HADOOP_HOME)    
jdk1.8.0_281 (JAVA_HOME)    
Anaconda Spyder IDE    
winutils (for hadoop 2.7.7)

Python 3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)] Type "copyright", "credits" or "license" for more information.

IPython 7.29.0 -- An enhanced Interactive Python.
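
The settings listed under Environment can be double-checked from inside the same Python session. The following is a minimal sketch, not part of the original setup: `report_spark_env` is a hypothetical helper name, and `PYSPARK_PYTHON` is included only on the assumption that it may matter here, since the post does not mention it.

```python
import os
import sys

# Variable names taken from the Environment list above, plus PYSPARK_PYTHON,
# which is an assumption (the post does not say whether it is set).
NAMES = ("SPARK_HOME", "HADOOP_HOME", "JAVA_HOME", "PYSPARK_PYTHON")


def report_spark_env(env=None):
    """Return a dict mapping each variable name to its value, or None if unset."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in NAMES}


if __name__ == "__main__":
    # Show which interpreter is running and what each variable is set to.
    print("interpreter:", sys.executable)
    for name, value in report_spark_env().items():
        print(f"{name} = {value}")
```

An unset variable shows up as `None`, which makes it easy to spot a setting that was lost during the reinstall.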

import pyspark
import pyspark.ml
from pyspark.sql import Row
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType, StructType, StructField

spark = SparkSession.builder.getOrCreate()

data2 = [(1, "James Smith"), (2, "Michael Rose"),
         (3, "Robert Williams"), (4, "Rames Rose"), (5, "Rames rose")
         ]
df2 = spark.createDataFrame(data=data2, schema=["id", "name"])

df2.printSchema()
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
root
 |-- id: long (nullable = true)
 |-- name: string (nullable = true)

df2.show()

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[Stage 0:>                                                          (0 + 1) / 1]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Traceback (most recent call last):

  File "C:\Users\***\AppData\Local\Temp/ipykernel_25396/2272422252.py", line 1, in <module>
    df2.show()

  File "C:\Users\***\anaconda3\lib\site-packages\pyspark\sql\dataframe.py", line 484, in show
    print(self._jdf.showString(n, 20, vertical))

  File "C:\Users\***\anaconda3\lib\site-packages\py4j\java_gateway.py", line 1309, in __call__
    return_value = get_return_value(

  File "C:\Users\***\anaconda3\lib\site-packages\pyspark\sql\utils.py", line 111, in deco
    return f(*a, **kw)

  File "C:\Users\***\anaconda3\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(

Py4JJavaError: An error occurred while calling o36.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DellXPS executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:182)
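
The "Python was not found" line in the output above suggests that the Python worker was launched through a command that Windows resolved to the Microsoft Store alias rather than the Anaconda interpreter. One thing that may be worth trying is pointing the `PYSPARK_PYTHON` environment variable at the running interpreter before the session is created. This is a sketch under that assumption, not a confirmed fix for this machine; `point_workers_at_this_interpreter` and `build_session` are hypothetical helper names.

```python
import os
import sys


def point_workers_at_this_interpreter(env=None):
    """Set PYSPARK_PYTHON to the interpreter running this code and return it."""
    env = os.environ if env is None else env
    # sys.executable is the full path of the current (here: Anaconda) Python.
    env["PYSPARK_PYTHON"] = sys.executable
    return env["PYSPARK_PYTHON"]


def build_session():
    """Hypothetical helper: set the variable first, then create the session."""
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    point_workers_at_this_interpreter()
    return SparkSession.builder.getOrCreate()
```

The variable most likely has to be set before the first SparkSession is created in the process, so in an already running Spyder console it may be necessary to restart the kernel and then call `build_session()` in place of the plain `SparkSession.builder.getOrCreate()` line.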


source https://stackoverflow.com/questions/70174906/pyspark-action-df-show-returns-java-error
