
PySpark action df.show() returns Java error

I am setting up Spark development through the Anaconda distribution on my Windows 10 desktop. The same setup used to work on this machine. I did some cleanup and reinstalled everything, and now I get an error when I ask Spark to show the data. Initializing the session, importing libraries, and loading data into a DataFrame all work fine until I call the action show(). I suspect it has something to do with my environment settings. What am I doing wrong?

Environment:

spark-3.1.2-bin-hadoop2.7 (SPARK_HOME & HADOOP_HOME)    
jdk1.8.0_281 (JAVA_HOME)    
Anaconda Spyder IDE    
winutils (for hadoop 2.7.7)
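
To double-check that the settings listed above are what the Python process actually sees, a small diagnostic along these lines may help. It is only a sketch: the helper name `describe_environment` is made up for illustration, and it assumes winutils.exe is expected under the bin folder of HADOOP_HOME.

```python
import os
from pathlib import Path


def describe_environment(env=None):
    """Collect the Spark-related settings visible to this Python process.

    `describe_environment` is a hypothetical helper, not part of PySpark.
    """
    env = os.environ if env is None else env
    report = {}
    for name in ("SPARK_HOME", "HADOOP_HOME", "JAVA_HOME", "PYSPARK_PYTHON"):
        # None means the variable is not set for this process.
        report[name] = env.get(name)
    hadoop_home = env.get("HADOOP_HOME")
    # Assumption: winutils.exe lives in %HADOOP_HOME%\bin.
    report["winutils.exe present"] = bool(hadoop_home) and (
        Path(hadoop_home) / "bin" / "winutils.exe"
    ).is_file()
    return report


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this inside the same Spyder console that starts Spark shows whether the IDE inherited the variables; a value of None means the variable is not visible there.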

Python 3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)] Type "copyright", "credits" or "license" for more information.

IPython 7.29.0 -- An enhanced Interactive Python.

import pyspark
from pyspark.sql import SparkSession
from pyspark.sql import Row
from pyspark.sql.types import StringType, StructType, StructField
import pyspark.ml

spark = SparkSession.builder.getOrCreate()

data2 = [(1, "James Smith"), (2, "Michael Rose"),
         (3, "Robert Williams"), (4, "Rames Rose"), (5, "Rames rose")
         ]
df2 = spark.createDataFrame(data=data2, schema=["id", "name"])

df2.printSchema()
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
root
 |-- id: long (nullable = true)
 |-- name: string (nullable = true)

df2.show()

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[Stage 0:>                                                          (0 + 1) / 1]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Traceback (most recent call last):

  File "C:\Users\***\AppData\Local\Temp/ipykernel_25396/2272422252.py", line 1, in <module>
    df2.show()

  File "C:\Users\***\anaconda3\lib\site-packages\pyspark\sql\dataframe.py", line 484, in show
    print(self._jdf.showString(n, 20, vertical))

  File "C:\Users\***\anaconda3\lib\site-packages\py4j\java_gateway.py", line 1309, in __call__
    return_value = get_return_value(

  File "C:\Users\***\anaconda3\lib\site-packages\pyspark\sql\utils.py", line 111, in deco
    return f(*a, **kw)

  File "C:\Users\***\anaconda3\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(

Py4JJavaError: An error occurred while calling o36.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DellXPS executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:182)
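
The "Python was not found" line in the output above suggests that the command Spark used to start its Python worker resolved to the Microsoft Store alias rather than to the Anaconda interpreter. Assuming that is the cause here, one thing to try is pointing the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables at the interpreter that is already running, before any SparkSession is created. The helper name below is hypothetical, and this is a sketch rather than a confirmed fix:

```python
import os
import sys


def point_workers_at_current_python(env=None):
    """Tell PySpark to launch workers with the interpreter running this code.

    Hypothetical helper for illustration; it only sets two environment
    variables that PySpark reads when the SparkContext starts.
    """
    env = os.environ if env is None else env
    env["PYSPARK_PYTHON"] = sys.executable
    env["PYSPARK_DRIVER_PYTHON"] = sys.executable
    return env["PYSPARK_PYTHON"]


# Call this before SparkSession.builder.getOrCreate(). A session that is
# already running keeps its old setting, so restart the kernel first.
print("Workers will use:", point_workers_at_current_python())
```

After this runs in a fresh kernel, creating the session and calling df2.show() as in the question would show whether the worker can now connect back. Setting the same two variables in the Windows environment settings is an alternative that does not depend on code order.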


source https://stackoverflow.com/questions/70174906/pyspark-action-df-show-returns-java-error
