Implement PySpark code using DataFrames, RDDs or Spark UDF functions: Find all companies with the name that is only two words (e

Subject: Computer Science

Implement PySpark code using DataFrames, RDDs or Spark UDF functions:

  1. Find all companies whose name consists of exactly two words (e.g. "Goldman Sachs"):
    • print the count of such companies and show() only the name and location (city, region, country_code) in the resulting Spark DataFrame
  2. Find all companies located in California:
    • print the count of such companies and show() only the name and location (city, region, country_code) in the resulting Spark DataFrame
  3. Add a "Blog" column to the DataFrame, with each row set to 1 if the "domain" field contains "blogspot.com" and 0 otherwise:
    • show() only the name, location (city, region, country_code) and "Blog" column for companies whose "Blog" field is 1
  4. Find all companies whose names are palindromes (the name reads the same forward and backward, e.g. "madam") using a Spark UDF:
    • print the count and show() only the name and location (city, region, country_code) in the resulting Spark DataFrame
