Seaborn: Dataset Explorations

Ömer Şenol
6 min read · Dec 21, 2020

A Turkish version of this article is also available (Türkçe).

The Seaborn library in Python is widely used for data visualization. It also comes with a set of example data sets, which we can load directly through Seaborn.
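As a minimal sketch of that loading step, assuming seaborn is installed and the machine is online for the first call (load_dataset fetches the file from Seaborn's online data repository and then caches it). get_dataset_names and load_dataset are Seaborn functions; value_range and show_dataset are helper names made up for this illustration.

```python
def value_range(values):
    """Return (min, max) of the non-missing values in an iterable of numbers."""
    # v == v is False only for NaN, which is how pandas marks missing numbers.
    present = [v for v in values if v is not None and v == v]
    return min(present), max(present)


def show_dataset(name="tips", column="total_bill"):
    """Load one Seaborn data set and print a quick overview of it."""
    # Imported here so that value_range can be reused without seaborn installed.
    import seaborn as sns

    print(sns.get_dataset_names())  # names of all available data sets
    df = sns.load_dataset(name)     # returns a pandas DataFrame
    print(df.head())                # first five rows
    print(df.describe())            # min, max and other summary statistics
    print(value_range(df[column]))  # the kind of range quoted in this article
```

Calling show_dataset() prints an overview of tips; any other data set name from this article can be passed instead, together with one of its numeric columns.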

Before working with any of these data sets, we need to know the story behind it. So today I will describe each of the Seaborn data sets and what its columns mean.

car_crashes

Car accidents in the states of the USA are examined. The data set records the factors behind fatal collisions and the related costs for car insurance companies, by state.

· total -> Number of drivers involved in fatal collisions per billion miles (5.900–23.900)

· speeding -> Drivers involved in fatal collisions who were speeding, on the same per-billion-miles scale as total rather than as a percentage (1.792–9.450)

· alcohol -> Drivers involved in fatal collisions who were alcohol-impaired, on the same scale as total (1.593–10.038)

· not_distracted -> Drivers involved in fatal collisions who were not distracted, on the same scale as total (1.760–23.661)

· no_previous -> Drivers involved in fatal collisions who had not been involved in any previous accidents, on the same scale as total (5.900–21.280)

· ins_premium -> Car insurance premiums in dollars (641.960–1301.520)

· ins_losses -> Losses incurred by insurance companies for collisions per insured driver, in dollars (82.750–194.780)

· abbrev -> Abbreviation of the US state
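The ranges above suggest that, in the copy Seaborn ships, speeding, alcohol, not_distracted and no_previous are stored on the same per-billion-miles scale as total rather than as percentages (no_previous, for example, never exceeds the maximum of total). Under that assumption, which is worth checking against the loaded data, the percentage can be recovered by dividing by total. The function name and the sample numbers below are made up for illustration.

```python
def share_of_total(part, total):
    """Return `part` as a percentage of `total`, or None if total is zero."""
    if total == 0:
        return None
    return 100.0 * part / total


# Illustrative numbers, not a row taken from the data set:
# 20.0 drivers per billion miles in total, 5.0 of them speeding.
print(share_of_total(5.0, 20.0))  # 25.0
```

With a DataFrame loaded from Seaborn, the same division can be applied to whole columns, for example 100 * crashes["speeding"] / crashes["total"], where crashes is an assumed variable name.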

diamonds

The properties of a large collection of diamonds (about 54,000) are measured and gathered into a data set.

· carat -> Weight of the diamond in carats; 1 carat = 0.2 gram (0.2–5.01)

· cut -> Diamond cut quality (Fair (Worst), Good, Very Good, Premium, Ideal (Best))

· color -> Diamond colour, from J (Worst) to D (Best)

· clarity -> A measurement of how clear the diamond is (I1 (Worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (Best))

· depth -> Total depth percentage, z / mean(x, y) = 2 * z / (x + y) (43–79)

· table -> Width of top of diamond relative to widest point (43–95)

· price -> Price in US dollars ($326–$18,823)

· x -> Length in mm (0–10.74)

· y -> Width in mm (0–58.90)

· z -> Depth in mm (0–31.80)
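The depth formula above can be sketched as a small function. Since the depth column runs from 43 to 79, the ratio is multiplied by 100 here; the function name and the sample stone are made up, and values recomputed from real rows may differ slightly from the stored depth because of rounding.

```python
def depth_percent(x, y, z):
    """Total depth percentage: z / mean(x, y) = 2 * z / (x + y), times 100."""
    if x + y == 0:
        # The ranges above start at 0 mm for x and y; the ratio is undefined there.
        return None
    return 100.0 * 2.0 * z / (x + y)


# Hypothetical stone: 4.0 mm long, 4.0 mm wide, 2.5 mm deep.
print(depth_percent(4.0, 4.0, 2.5))  # 62.5
```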

exercise

A group of 30 people takes part in an experiment. The aim is to measure each person’s pulse rate during different activities, depending on the diet they follow (no fat or low fat).

· id -> Number of the person being tested (1–30)

· diet -> Diet the person follows (no fat, low fat)

· pulse -> Person’s heart rate (80–150)

· time -> Time elapsed during the activity (1 min, 15 min, 30 min)

· kind -> Type of activity during the test (rest, walking, running)

flights

A data set that records the number of airline passengers by year and month.

· year -> Year (1949–1960)

· month -> Month

· passengers -> Total passengers in that month (104–622)

fmri

A data set of signals recorded from the brain with functional magnetic resonance imaging (fMRI).

· subject -> The subject the data is collected from (there are 14 subjects, each with 76 measurements)

· timepoint -> Time point of the measurement (0–18)

· event -> Type of event during which the signal is recorded (stim, cue)

· region -> Region of the brain where the signal is collected (parietal, frontal)

· signal -> The strength of the recorded signal (-0.255–0.564)

geyser

It is a data set that holds eruptions of the Old Faithful geyser in Yellowstone National Park.

· duration -> Duration of the eruption in minutes (1.6–5.1)

· waiting -> Waiting time until the next eruption in minutes (43–96)

· kind -> Type of eruption (waiting < 70: short, waiting >= 70: long)
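The threshold rule for kind can be written as a one-line function. This is only a sketch of the rule as it is stated above; I have not checked how Seaborn actually derived the column, and the function name and the default threshold parameter are made up.

```python
def classify_by_waiting(waiting, threshold=70):
    """Label by the rule quoted above: below the threshold is short, otherwise long."""
    return "short" if waiting < threshold else "long"


# Check the two ends of the waiting range and both sides of the threshold.
for waiting in (43, 69, 70, 96):
    print(waiting, classify_by_waiting(waiting))
```

Comparing the result with the kind column of the loaded data is a quick way to confirm or reject the rule.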

iris

Sepal and petal lengths and widths are used to distinguish between flower species belonging to the genus Iris. The data set holds these measurements for each species.

· sepal_length -> Sepal length in centimeters (4.3–7.9)

· sepal_width -> Sepal width in centimeters (2.0–4.4)

· petal_length -> Petal length in centimeters (1.0–6.9)

· petal_width -> Petal width in centimeters (0.1–2.5)

· species -> Species of the flower (setosa, versicolor, virginica)

mpg

It is a data set in which cars are listed together with their technical features.

· mpg -> Miles driven on 1 gallon of gasoline (1 gallon ≈ 3.7 L, 1 mile ≈ 1.6 km) (9–46.6)

· cylinders -> The number of cylinders in the car (3–8)

· displacement -> Engine displacement in cubic inches (68–455)

· horsepower -> Horsepower (46–230)

· weight -> Weight in pounds (1613–5140)

· acceleration -> Time to accelerate from 0 to 60 mph (about 0–97 km/h) in seconds (8–24.8)

· model_year -> Car model year (70–82, that is 1970–1982)

· origin -> Region where the car was produced (usa, europe, japan)

· name -> Model name of the car
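For readers used to metric units, the mpg column can be converted to liters per 100 km. The sketch below assumes US gallons and uses the exact conversion factors instead of the rounded 3.7 and 1.6; the function and constant names are made up for this example.

```python
LITERS_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344


def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to liters per 100 km."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    km_per_liter = mpg * KM_PER_MILE / LITERS_PER_US_GALLON
    return 100.0 / km_per_liter


# The two ends of the mpg range listed above.
for mpg in (9.0, 46.6):
    print(f"{mpg} mpg = {mpg_to_l_per_100km(mpg):.1f} L/100 km")
```

Note that the scale is inverted: a higher mpg means a lower consumption in L/100 km.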

penguins

It is a data set built by measuring the body features of penguins of several species.

· species -> Species of the penguin

· island -> Island where the penguin is located

· bill_length_mm -> Length of the bill in millimeters (32.1–59.6)

· bill_depth_mm -> Depth of the bill in millimeters (13.1–21.5)

· flipper_length_mm -> Flipper length in millimeters (172–231)

· body_mass_g -> Penguin’s body mass in grams (2700–6300)

· sex -> Sex of the penguin (Male, Female)

planets

It is a data set, published by NASA, about planets discovered outside our solar system (exoplanets).

· method -> Name of the method used to detect the planet

· number -> The number of planets in the system where the planet was found (1–7)

· orbital_period -> Orbital period of the planet in days (0.09–730000)

· mass -> Mass of the planet (0.003–25)

· distance -> The distance of the system from us (1.350–8500)

· year -> The year the planet was discovered (1989–2014)

tips

It is a data set that records restaurant bills and the tips paid on them.

· total_bill -> The amount of the bill in dollars ($3.07–$50.81)

· tip -> Tip amount in dollars ($1–$10)

· sex -> Gender of the person paying the bill (Male, Female)

· smoker -> Whether there were smokers at the table (No, Yes)

· day -> Day of the visit (Thur, Fri, Sat, Sun)

· time -> Time of day (Lunch, Dinner)

· size -> Number of people sitting at the table (1–6)

titanic

A data set that holds statistics on the passengers of the Titanic. There are 891 passengers in the data set.

· survived -> Did he/she survive? (0: No, 1: Yes)

· pclass -> Ticket class of the passenger (1 (Best), 2, 3 (Worst))

· sex -> Passenger’s gender (male, female)

· age -> Passenger’s age (0.42–80; values below 1 are babies under 1 year old)

· sibsp -> The number of siblings and spouses of the passenger on board (0–8)

· parch -> The number of parents and children of the passenger on board (0–6)

· fare -> The amount the passenger paid for the ticket (0–512.3292)

· embarked -> The port where the passenger boarded the ship (S, C, Q)

· class -> Passenger class (First (Best), Second, Third (Worst))

· who -> Category of the passenger (child, man, woman)

· adult_male -> Whether the passenger is an adult man (True, False)

· deck -> The deck of the passenger’s cabin

· embark_town -> The town of the port where the passenger boarded the ship (Southampton, Cherbourg, Queenstown)

· alive -> Whether the passenger survived (no, yes)

· alone -> Whether the passenger was travelling alone (True, False)
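Several of these columns look like re-encodings of others: class of pclass, alive of survived, embark_town of embarked (S, C and Q being the ports of Southampton, Cherbourg and Queenstown), and alone of sibsp and parch. Below is a rough sketch of those relationships, written as an assumption to check against the loaded data and not as Seaborn's own preprocessing. The function name, the constant names and the sample passenger are made up.

```python
CLASS_NAMES = {1: "First", 2: "Second", 3: "Third"}
PORT_NAMES = {"S": "Southampton", "C": "Cherbourg", "Q": "Queenstown"}


def derive_columns(survived, pclass, sibsp, parch, embarked):
    """Rebuild the text columns from the coded ones (assumed mapping)."""
    return {
        "class": CLASS_NAMES[pclass],
        "alive": "yes" if survived == 1 else "no",
        "embark_town": PORT_NAMES.get(embarked),  # None if the port is missing
        "alone": sibsp + parch == 0,              # assumed definition of alone
    }


# Hypothetical passenger: survived, third class, no relatives on board, port S.
print(derive_columns(survived=1, pclass=3, sibsp=0, parch=0, embarked="S"))
```

If the mapping holds, one column of each pair can be dropped before modelling, since it carries no extra information.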
