Categories: java, apache-spark, dataset

Resolved attribute(s) newvalue#10 missing in Java spark

1 answer

I am trying to run the following code:

        SparkSession spark = SparkSession
                .builder()
                .appName("test")
                .master("local")
                // .enableHiveSupport()
                .getOrCreate();

        List<String> list = new ArrayList<String>();
        list.add("HI");
        list.add("HI");
        list.add("HI");
        Dataset<Row> dataDs = spark.createDataset(list, Encoders.STRING()).toDF();

        List<String> list2 = new ArrayList<String>();
        list2.add("1");
        list2.add("2");
        list2.add("3");
        Dataset<Row> dataDs2 = spark.createDataset(list2, Encoders.STRING()).toDF()
                .withColumnRenamed("value", "newvalue");

        Column col = dataDs2.col("newvalue");
        dataDs = dataDs.withColumn("newcol", col);
        dataDs.show();

However, it fails with the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved attribute(s) newvalue#10 missing from value#1 in operator !Project [value#1, newvalue#10 AS newcol#13];; !Project [value#1, newvalue#10 AS newcol#13]

When I searched for it online, I found that it can be caused by duplicate column names. However, my column names are different: dataDs has a column named 'value', while dataDs2 has a column named 'newvalue'. So I do not understand why the error still occurs. Can someone help me out?


The best answer:

The problem is here:

        Column col = dataDs2.col("newvalue");
        dataDs = dataDs.withColumn("newcol", col);

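Your col is a column of dataDs2, so you cannot use it in dataDs. withColumn() can only resolve columns that belong to the Dataset it is called on, which is why the analyzer reports newvalue#10 as missing from value#1.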

It looks like you want to zip() two DataFrames. There is an RDD.zip() function for that. See more methods here: How to zip two (or more) DataFrame in Spark
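A minimal sketch of the zip approach, assuming both DataFrames hold a single string column each. RDD.zip() requires both sides to have the same number of partitions and the same number of elements in each partition, which holds for the small local example from the question but not in general. The class name ZipExample and the helper zipColumns are hypothetical names used only for illustration.

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ZipExample {

    // Hypothetical helper: pairs the rows of two single-string-column DataFrames by position.
    // Assumes identical partitioning on both sides, as required by RDD.zip().
    static Dataset<Row> zipColumns(SparkSession spark, Dataset<Row> left, Dataset<Row> right) {
        // Zip the underlying RDDs and flatten each pair of rows into one two-field row.
        JavaRDD<Row> zipped = left.javaRDD()
                .zip(right.javaRDD())
                .map(pair -> RowFactory.create(pair._1().getString(0), pair._2().getString(0)));

        // Schema of the combined result: the original column plus the new one.
        StructType schema = new StructType()
                .add("value", DataTypes.StringType)
                .add("newcol", DataTypes.StringType);

        return spark.createDataFrame(zipped, schema);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("test")
                .master("local")
                .getOrCreate();

        Dataset<Row> dataDs = spark
                .createDataset(Arrays.asList("HI", "HI", "HI"), Encoders.STRING())
                .toDF();
        Dataset<Row> dataDs2 = spark
                .createDataset(Arrays.asList("1", "2", "3"), Encoders.STRING())
                .toDF()
                .withColumnRenamed("value", "newvalue");

        zipColumns(spark, dataDs, dataDs2).show();

        spark.stop();
    }
}
```

If the two DataFrames may be partitioned differently, a more robust variant is to attach a positional index to each side with zipWithIndex() and join on that index instead of calling zip() directly.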
