Data Description

Data Source: 

https://bikeshare.metro.net/about/data/

Data Preprocessing and Data Imputation

We performed initial preprocessing to clean the data and improve the accuracy of the results. We removed redundant records, dropped columns that would contribute little to the project, and kept the most important variables (attributes).
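
The cleaning step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual pipeline: the sample rows, the KEEP_COLUMNS selection, and the preprocess helper are all assumptions made for the example, since the exact columns and rules used are not listed here.

```python
import csv
import io

# Hypothetical sample in the shape of the trip data; all values are made up.
SAMPLE_CSV = """trip_id,duration,start_station,end_station,bike_id,passholder_type
1001,12,3005,3031,6001,Monthly Pass
1001,12,3005,3031,6001,Monthly Pass
1002,45,3010,3010,6002,Walk-up
1003,8,3005,,6003,Monthly Pass
"""

# Assumed subset of columns to keep; the project's real selection may differ.
KEEP_COLUMNS = ["trip_id", "duration", "start_station", "end_station", "passholder_type"]


def preprocess(rows, keep_columns):
    """Drop duplicate trips and incomplete rows, keeping only selected columns."""
    seen_trip_ids = set()
    cleaned = []
    for row in rows:
        if row["trip_id"] in seen_trip_ids:
            continue  # redundant record: this trip_id was already seen
        seen_trip_ids.add(row["trip_id"])
        reduced = {name: row[name] for name in keep_columns}
        if any(value == "" for value in reduced.values()):
            continue  # incomplete record: one of the kept columns is empty
        cleaned.append(reduced)
    return cleaned


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = preprocess(rows, KEEP_COLUMNS)
print(len(rows), "rows before cleaning,", len(cleaned), "rows after")
```

Dropping incomplete rows is only one option. Where a missing value can be recovered (for example, a station's coordinates from its station ID), imputing it would keep more trips in the data set.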

Data Fields Description

  • trip_id: Locally unique integer that identifies the trip
  • duration: Length of trip in minutes
  • start_time: The date/time when the trip began, presented in ISO 8601 format in local time
  • end_time: The date/time when the trip ended, presented in ISO 8601 format in local time
  • start_station: The station ID where the trip originated (for station name and more information on each station see the Station Table)
  • start_lat: The latitude of the station where the trip originated
  • start_lon: The longitude of the station where the trip originated
  • end_station: The station ID where the trip terminated (for station name and more information on each station see the Station Table)
  • end_lat: The latitude of the station where the trip terminated
  • end_lon: The longitude of the station where the trip terminated
  • bike_id: Locally unique integer that identifies the bike
  • plan_duration: The number of days that the plan the passholder is using entitles them to ride; 0 is used for a single ride plan (Walk-up)
  • trip_route_category: “Round Trip” for trips starting and ending at the same station or “One Way” for all other trips
  • passholder_type: The name of the passholder’s plan
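
To illustrate how the fields above relate to one another, the following sketch parses one record into typed values and checks two of the documented relationships: trip_route_category against the start and end stations, and duration against the two timestamps. The Trip class, the helper functions, and the sample record are hypothetical and cover only a subset of the fields; they are not part of the published data or of the project code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trip:
    trip_id: int
    duration: int             # length of trip in minutes
    start_time: datetime      # local time, parsed from ISO 8601
    end_time: datetime        # local time, parsed from ISO 8601
    start_station: int
    end_station: int
    plan_duration: int        # days; 0 means a single ride plan (Walk-up)
    trip_route_category: str  # "Round Trip" or "One Way"
    passholder_type: str


def parse_trip(raw):
    """Convert a dict of strings (one CSV row) into a Trip with typed fields."""
    return Trip(
        trip_id=int(raw["trip_id"]),
        duration=int(raw["duration"]),
        start_time=datetime.fromisoformat(raw["start_time"]),
        end_time=datetime.fromisoformat(raw["end_time"]),
        start_station=int(raw["start_station"]),
        end_station=int(raw["end_station"]),
        plan_duration=int(raw["plan_duration"]),
        trip_route_category=raw["trip_route_category"],
        passholder_type=raw["passholder_type"],
    )


def expected_route_category(trip):
    """Derive the route category from the station IDs, per the field description."""
    return "Round Trip" if trip.start_station == trip.end_station else "One Way"


def elapsed_minutes(trip):
    """Minutes between start_time and end_time, for comparison with duration."""
    return (trip.end_time - trip.start_time).total_seconds() / 60


# Hypothetical record; the values are made up for illustration.
raw = {
    "trip_id": "1002",
    "duration": "45",
    "start_time": "2019-07-01T08:15:00",
    "end_time": "2019-07-01T09:00:00",
    "start_station": "3010",
    "end_station": "3010",
    "plan_duration": "0",
    "trip_route_category": "Round Trip",
    "passholder_type": "Walk-up",
}

trip = parse_trip(raw)
print(trip.trip_route_category == expected_route_category(trip))
print(elapsed_minutes(trip), trip.duration)
```

Checks like these can flag records whose derived values disagree with the stored ones, which is one way to find rows that need cleaning.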