Obviously AI includes a variety of publicly available datasets that you can use for predictions. These datasets not only help when you are short on data, but also give you an idea of the features/columns to look for while building your own dataset for your use case.

Superstore Retail Dataset

Artificial intelligence (AI) and machine learning are set to transform the retail industry, driving deeper insights into customer behavior, operations, finances, and human resources. Obviously AI gives your retail organization the tools needed to accurately forecast demand and inventory, better understand customer behavior, and optimize staffing, helping you dominate your market and delight customers. This data was collected from a global superstore and covers the past 4 years. It is especially helpful if you want to predict sales and forecast demand.

It contains 50,000 rows and 23 columns.

The various columns that are included in this dataset are as follows:

row_id
order_id
order_date
ship_date
ship_mode
customer_id
customer_name
segment
city
state
country
market
region
product_id
category
sub_category
product_name
order_priority
quantity
discount
profit
shipping_cost
sales
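
If you are building your own retail dataset, one way to use the column list above is to compare it against the header of your own CSV file. The following is a minimal sketch in Python, not part of Obviously AI itself: the function name `compare_columns` and the sample header are hypothetical and used only for illustration.

```python
import csv
import io

# Column names copied from the Superstore list above.
SUPERSTORE_COLUMNS = [
    "row_id", "order_id", "order_date", "ship_date", "ship_mode",
    "customer_id", "customer_name", "segment", "city", "state",
    "country", "market", "region", "product_id", "category",
    "sub_category", "product_name", "order_priority", "quantity",
    "discount", "profit", "shipping_cost", "sales",
]


def compare_columns(csv_file, expected):
    """Return (missing, extra) column names for the header row of an open CSV file.

    Hypothetical helper for illustration; it only reads the first row.
    """
    header = [name.strip() for name in next(csv.reader(csv_file))]
    missing = [name for name in expected if name not in header]
    extra = [name for name in header if name not in expected]
    return missing, extra


# Made-up header for illustration only; it is not taken from the real dataset.
sample = io.StringIO("order_id,order_date,sales,store_manager\n")
missing, extra = compare_columns(sample, SUPERSTORE_COLUMNS)
print("Missing columns:", missing)
print("Extra columns:", extra)
```

To check a real file, you could pass `open("your_file.csv", newline="")` in place of the in-memory sample.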

Airbnb Homes Dataset

This dataset contains information about Airbnb home listings in New York City for the year 2019. It includes columns such as the location of the listing, host name, price, geographical coordinates, and reviews. It can be used to predict the price of a listing given its location, reviews, and room type.

The dataset consists of 48,895 rows and 13 columns.

The features that are included in this dataset are as follows:

id
name
host_name
neighbourhood_group
neighbourhood
room_type
last_review
availability_365
minimum_nights
number_of_reviews
reviews_per_month
calculated_host_listings_count
price
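
To illustrate what "predicting the price from location, reviews, and room type" means in terms of columns, here is a minimal Python sketch that separates the target column (`price`) from a few feature columns. The choice of features, the function name `split_features_target`, and the sample rows are assumptions made for illustration; the rows are invented and do not come from the real dataset.

```python
import csv
import io

# Target and feature columns, chosen from the list above for illustration.
TARGET = "price"
FEATURES = [
    "neighbourhood_group",
    "neighbourhood",
    "room_type",
    "number_of_reviews",
    "reviews_per_month",
]


def split_features_target(csv_file, features, target):
    """Read an open CSV file and return (X, y).

    X is a list of dicts holding only the feature columns;
    y is a list of target values converted to float.
    """
    rows = list(csv.DictReader(csv_file))
    X = [{name: row[name] for name in features} for row in rows]
    y = [float(row[target]) for row in rows]
    return X, y


# Invented rows for illustration only.
sample = io.StringIO(
    "id,neighbourhood_group,neighbourhood,room_type,"
    "number_of_reviews,reviews_per_month,price\n"
    "1,Brooklyn,Williamsburg,Private room,12,0.5,80\n"
    "2,Manhattan,Harlem,Entire home/apt,30,1.2,150\n"
)
X, y = split_features_target(sample, FEATURES, TARGET)
print(X[0])
print(y)
```

Identifier columns such as `id` and `name` are left out of the features here because they label a listing rather than describe it.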

Marketing Campaign Dataset

A marketing campaign promotes products through channels such as newspapers, promotions, and television ads. Marketing is extremely important for a product to be successful, but targeting the right, high-value customers is a challenge. Predictive analytics and machine learning help tackle this issue by finding patterns in buying behavior and customer demographics, which helps identify high-value customers and retain them. Customer and marketing analytics help increase growth and profitability.

The data consists of 9,134 rows and 24 columns.

The features/columns included in this dataset are:

Customer
State
Customer Lifetime Value
Response
Coverage
Education
Effective To Date
EmploymentStatus
Gender
Income
Location Code
Marital Status
Monthly Premium Auto
Months Since Last Claim
Months Since Policy Inception
Number of Open Complaints
Number of Policies
Policy Type
Policy
Renew Offer Type
Sales Channel
Total Claim Amount
Vehicle Class
Vehicle Size

Avocado Prices

Avocados are one of the most popular fruits and are cultivated in tropical and Mediterranean climates throughout the world. According to Transparency Market Research (TMR), the global avocado market was valued at $13.64 billion in 2018 and is predicted to attain an overall value of $21.56 billion by 2026.

This dataset has 18,249 rows and 11 columns.

The features/columns included in this dataset are:

ID
Date
AveragePrice
Total Volume
Total Bags
Small Bags
Large Bags
XLarge Bags
type
year
region

FIFA Players

Football is one of the most popular sports and is widely played in Europe and South America. The use of artificial intelligence and machine learning in sports analytics has been increasing. Sports analytics is a field that applies data science techniques to analyze various components of the sports industry, such as player performance, business performance, recruitment, and more.

This data contains information about the players, such as their clubs, height and weight, and various performance parameters. It can be used to compare the performance of players and to predict which players are in good form and likely to perform well in a game.

This dataset consists of 18,207 rows and 52 columns.

Some of the features/columns included in this dataset are:

id
player_name
age
nationality
overall
potential
club
wage
special
preferred_foot
international_reputation
weak_foot
skill_moves
body_type
position
jersey_number
height
weight
crossing
finishing

NIFTY 500

The NIFTY is a benchmark stock market index that represents the largest companies listed on the National Stock Exchange (NSE). It is one of the main stock indexes used in India. NIFTY 500 represents the top 500 companies in India's National Stock Exchange (NSE) based on market capitalization and average daily turnover. It represents 94% of free float market capitalization of stocks listed on NSE.

The NIFTY 500 dataset consists of 500 rows and 14 columns.

The features/columns included in this dataset are:

company
industry
symbol
category
market_cap
current_value
high_52week
low_52week
book_value
price_earnings
dividend_yield
roce
roe
sales_growth_3yr

Restaurant Data

This dataset contains information about restaurants based in Bengaluru, India. Bengaluru is considered the Silicon Valley of India and has a large number of restaurants serving cuisines from different parts of the world. This data is helpful for getting insights into the kind of food popular in different neighborhoods, the cuisines people prefer, and the relationship between affordability and popularity.

This dataset consists of 10,000 rows and 14 columns.

The features/columns included in this dataset are:

restaurant_name
address
location
phone
type
cost
online_order
book_table
rating
votes
dish_liked
cuisines
meal_type
meal_city

Log in to your account to use the Data Store.
