Welcome to Analytics and Data Summit 2020.
Wednesday, February 26 • 10:05am - 10:55am
Constructing a Large ADW Graph on Railroad Data & Visualizing that Graph in OAC

To gain experience and grow skills with the Spatial & Graph library that is now available in Oracle's Autonomous Data Warehouse (ADW), we have built a network graph on top of railroad data that the government of India makes available at https://data.gov.in/resources/indian-railways-time-table-trains-available-reservation-01112017. We suspect that our experiences will be of interest to others desiring to build a large ADW graph for other use-cases in other industries, and the following provides a progress report with lessons learnt.

The movements of the many trains that travel across this railroad network trace a graph whose nodes are train stations and whose edges are the connecting railroad segments. The transactional input data details the motions of roughly 10K distinct trains as they execute about 200K trips along 20K distinct rail segments that connect 10K railroad stations, all in a single day in India. Our first step is therefore to compose an Oracle Machine Learning (OML) notebook that distills that transactional input data down to two aggregate ADW tables: a nodes table listing the 10K stations and their summed properties (total number of trains visiting each station and number of edges radiating from each station), and an edges table containing 20K records detailing the mean train speed and distance between adjacent stations. Building the graph in ADW is then a blissfully simple SQL one-liner.
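The aggregation step described above can be sketched in plain Python. This is only an illustration, not the OML notebook itself: the row layout (train, stop sequence, station, cumulative kilometres, cumulative hours), the sample rows, and the names `stops` and `build_tables` are all assumptions made for the example, and edges are treated as directed in the order a train travels them.

```python
from collections import defaultdict

# Hypothetical timetable rows (not the real data.gov.in schema):
# (train_id, stop_sequence, station, km_from_origin, hours_from_origin)
stops = [
    ("T1", 1, "A", 0.0, 0.0),
    ("T1", 2, "B", 60.0, 1.0),
    ("T1", 3, "C", 150.0, 2.5),
    ("T2", 1, "B", 0.0, 0.0),
    ("T2", 2, "C", 90.0, 1.0),
]

def build_tables(stops):
    """Distill per-stop rows into a nodes table and an edges table."""
    by_train = defaultdict(list)
    for train, seq, station, km, hours in stops:
        by_train[train].append((seq, station, km, hours))

    trains_at = defaultdict(set)   # station -> set of trains that visit it
    legs = defaultdict(list)       # (from, to) -> [(km, hours), ...]
    for train, rows in by_train.items():
        rows.sort()                # order each train's stops by sequence
        for _, station, _, _ in rows:
            trains_at[station].add(train)
        # Consecutive stops of one train form one traversal of a segment.
        for (_, a, km_a, h_a), (_, b, km_b, h_b) in zip(rows, rows[1:]):
            legs[(a, b)].append((km_b - km_a, h_b - h_a))

    edges = {}
    degree = defaultdict(int)      # edges touching each station
    for (a, b), runs in legs.items():
        total_km = sum(km for km, _ in runs)
        total_h = sum(h for _, h in runs)
        edges[(a, b)] = {
            "distance_km": total_km / len(runs),
            "mean_speed_kmh": total_km / total_h if total_h > 0 else None,
            "trips": len(runs),
        }
        degree[a] += 1
        degree[b] += 1

    nodes = {
        station: {"n_trains": len(trains), "n_edges": degree[station]}
        for station, trains in trains_at.items()
    }
    return nodes, edges

nodes, edges = build_tables(stops)
```

Here the mean speed is computed as total distance over total time across all traversals of a segment; averaging per-trip speeds instead would be an equally reasonable choice.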

We then compose another OML notebook to ask the usual sorts of questions of this graph, such as: what is the shortest path (time-wise and distance-wise) connecting two arbitrary stations? The resulting path is then decorated with all adjacent stations that are one or two train rides away, and those results are exposed to Oracle Analytics Cloud (OAC) as derived ADW tables. Note that the graph of the entire India railroad network is too large to visualize in OAC (its 10K nodes and 20K edges overwhelm an OAC canvas), so we used a variety of methods to filter this graph in a way that extracts this railroad network's main artery containing the busiest nodes and edges.
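To make the shortest-path question concrete, here is a minimal sketch using Dijkstra's algorithm in plain Python. It is not the ADW graph query used in the notebook: the sample edges, the assumption that every segment can be travelled in both directions, and the function names `build_adjacency`, `shortest_path`, and `within_hops` are hypothetical and exist only for this example.

```python
import heapq

# Hypothetical edge rows: (from_station, to_station, distance_km, mean_speed_kmh)
edge_rows = [
    ("A", "B", 60.0, 60.0),
    ("B", "C", 90.0, 60.0),
    ("A", "C", 120.0, 40.0),
    ("C", "D", 50.0, 50.0),
]

def build_adjacency(edge_rows):
    """Map each station to a list of (neighbour, km, hours)."""
    adj = {}
    for a, b, km, kmh in edge_rows:
        hours = km / kmh
        adj.setdefault(a, []).append((b, km, hours))
        adj.setdefault(b, []).append((a, km, hours))  # assumption: two-way segments
    return adj

def shortest_path(adj, src, dst, by="distance"):
    """Dijkstra's algorithm; `by` selects 'distance' (km) or 'time' (hours)."""
    idx = 0 if by == "distance" else 1
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, km, hours in adj.get(node, []):
            new_cost = cost + (km, hours)[idx]
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if dst not in best:
        return None, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]

def within_hops(adj, path, max_hops=2):
    """Stations not on the path that are at most `max_hops` edges from it."""
    seen = set(path)
    frontier = set(path)
    for _ in range(max_hops):
        frontier = {n for s in frontier for n, _, _ in adj.get(s, [])} - seen
        seen |= frontier
    return seen - set(path)

adj = build_adjacency(edge_rows)
km, path_by_distance = shortest_path(adj, "A", "D", by="distance")
hours, path_by_time = shortest_path(adj, "A", "D", by="time")
decoration = within_hops(adj, ["A", "B"], max_hops=2)
```

In this toy network the two criteria disagree: the direct A to C segment is shorter in kilometres, while the route through B is faster, which is why the path is computed both ways.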

This effort's final task will be to sprinkle this ADW graph with numerous virtual traveling agents, and to train a machine-learning algorithm to direct the movements of these agents across this graph in a way that maximizes the virtual rewards gathered by these agents while minimizing travel time and expense. Progress achieved on this final task will be reported at conference time, where we will also show how our solution to this Pacman-like problem is relevant to ADW users in sales and finance.

Siddesh C Prabhu Dev Ujjni

Staff Solutions Engineer, Oracle
Siddesh is an Oracle Cloud Solutions Engineer who has worked primarily on Machine Learning, Artificial Intelligence and Advanced Analytics since 2018. He is working with Oracle Labs on Oracle's patented Machine Learning algorithm (MSET2), ensuring the algorithm gains momentum and leads to...
Dhvani Sheth

Senior Solutions Engineer, Oracle
Dhvani is a Senior Solutions Engineer at Oracle specializing in Oracle Machine Learning and Analytics Cloud. She designs and develops solutions for customers to help them achieve their technological goals. She is the lead for the Ballerina-OCI (Oracle Cloud Infrastructure) module. Her...

Wednesday February 26, 2020 10:05am - 10:55am PST
Bldg 03 - Auditorium & General