1 How many instances and features (attributes) does it have? (use shape on the data frame) 2 Use the head method to retrieve the first 5 records

Subject: Computer Science

1 How many instances and features (attributes) does it have? (use shape on the data frame)

2 Use the head method to retrieve the first 5 records.

3 Output the attribute names using the columns property.

4 What car model is the one with the highest mileage?

5 How many cars have more than 250,000 kilometers?

6 Draw a box plot of the mileage by fuel type. Is there any interesting insight or pattern?

7 Plot a histogram of the variable price. What is the distribution like?

Answer Preview

The complete Python code is as follows.

 

We import the pandas library and load the dataset with its read_csv function:

import pandas as pd
#replace with your local path to the dataset
df = pd.read_csv(r'C:\Users\edwin\Downloads\archive\ToyotaCorolla.csv')

 

1. The number of rows (instances) and columns (attributes) is given by the shape property, which returns a (rows, columns) tuple:

df.shape 
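Because shape is a plain tuple, the two counts can be unpacked directly. The following is a minimal sketch that uses a small made-up frame in place of ToyotaCorolla.csv; the name toy and all of its values are illustrative and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; values are made up.
toy = pd.DataFrame({
    "KM": [46986, 72937, 41711],
    "Fuel_Type": ["Diesel", "Diesel", "Petrol"],
    "Price": [13500, 13750, 13950],
})

# shape is a (rows, columns) tuple, so it unpacks into two integers.
n_rows, n_cols = toy.shape
print(f"{n_rows} instances, {n_cols} attributes")
```

With the real data, replacing toy with df gives the answer to question 1.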

 

2. We retrieve the first five records using the head method of the data frame:

df.head(5)

 

3. The column names are retrieved using the columns property:

df.columns

 

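4. The car with the highest mileage is found by filtering for the row whose KM equals the column maximum; the model can be read from the returned row: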

df[df["KM"] == df["KM"].max()]
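If only the model name is wanted rather than the whole row, idxmax can be combined with loc. This is a sketch under assumptions: the column name "Model" and the rows below are hypothetical, so check df.columns for the actual name in your copy of the dataset.

```python
import pandas as pd

# Hypothetical rows; "Model" is an assumed column name.
toy = pd.DataFrame({
    "Model": ["Corolla A", "Corolla B", "Corolla C"],
    "KM": [46986, 243000, 41711],
})

# idxmax returns the index label of the first row with the largest KM.
top_model = toy.loc[toy["KM"].idxmax(), "Model"]
print(top_model)
```

Note that idxmax reports only the first match when several cars tie for the maximum, whereas the equality filter above returns every tied row.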

 

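5. The cars with more than 250,000 km are selected with a boolean filter, and summing the same condition gives how many there are:

df[df['KM'] > 250000]
# count the rows that satisfy the condition
(df['KM'] > 250000).sum()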

 

6. The box plot of mileage (KM) by fuel type is drawn with seaborn:

import seaborn as sns
sns.boxplot(x = df['Fuel_Type'], y = df['KM'])
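To back up what the box plot shows with numbers, the mileage can be summarized per fuel type with groupby. The sketch below uses made-up rows (the toy frame is hypothetical and its values are not from the dataset); on the real data, replace toy with df.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset.
toy = pd.DataFrame({
    "Fuel_Type": ["Petrol", "Petrol", "Diesel", "Diesel", "CNG"],
    "KM": [40000, 60000, 110000, 130000, 90000],
})

# Per-group count, median and maximum of KM, the same quantities
# a box plot conveys visually.
km_by_fuel = toy.groupby("Fuel_Type")["KM"].agg(["count", "median", "max"])
print(km_by_fuel)
```

The count column is worth checking before drawing conclusions, since a fuel type with very few cars produces an unreliable box.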

 

7. The histogram of the Price attribute is plotted as follows:

import matplotlib.pyplot as plt
plt.hist(df['Price'])
plt.title("Price Histogram")
plt.xlabel("Price")
plt.ylabel("Number of cars")
plt.show()
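The shape of the distribution can also be checked numerically. A minimal sketch, assuming made-up prices rather than the real Price column: a positive skew value and a mean above the median both point to a long right tail.

```python
import pandas as pd

# Hypothetical prices; one expensive car creates a right tail.
prices = pd.Series([9000, 9500, 10000, 10500, 11000, 25000], name="Price")

skewness = prices.skew()      # sample skewness; > 0 means right-skewed
mean_price = prices.mean()
median_price = prices.median()
print(f"skew={skewness:.2f}, mean={mean_price:.0f}, median={median_price:.0f}")
```

On the real data the same calls would be df['Price'].skew(), df['Price'].mean() and df['Price'].median().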

 

 

Step-by-step explanation

We used three libraries to complete this assignment: pandas for reading and querying the data, and matplotlib.pyplot and seaborn for plotting.

In our work we found that (5) no cars have been driven more than 250,000 km. From the box plot (6) we gained the insight that petrol cars tend to have lower mileage than diesel and CNG cars. In the histogram (7) most vehicles fall in the mid price range: the count rises sharply to a peak and then decreases as the price increases, which suggests a right-skewed distribution.