Charlotte1904/Meetup-Trends-Prediction

PROJECT OBJECTIVES

  • Built a production-ready streaming and prediction pipeline using the Meetup API and AWS
  • Predicted the next industry-specific trend in real time and tracked its popularity over time through interactive time-series graphs
  • Discovered the emergence of a new community (category) in a city using network graphs
  • Implemented time series to see the rise of a topic and how different topics evolve over time

Methodology 

Data Architecture

  • Example of a raw RSVP streaming file

  • For more information, click here.
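As a rough illustration of what handling one streamed RSVP might look like, here is a minimal sketch. The sample record and its field names (`mtime`, `response`, `group`, `group_city`, `group_topics`, `urlkey`) are assumptions made for this example, not taken from the repository, and `parse_rsvp` is a hypothetical helper.

```python
import json

# Hypothetical sample record; the field names are assumptions, not confirmed by this project.
raw_line = json.dumps({
    "mtime": 1500000000000,
    "response": "yes",
    "group": {
        "group_city": "San Francisco",
        "group_topics": [
            {"urlkey": "big-data", "topic_name": "Big Data"},
            {"urlkey": "machine-learning", "topic_name": "Machine Learning"},
        ],
    },
})


def parse_rsvp(line):
    """Flatten one JSON-encoded RSVP into the fields used for later analysis."""
    record = json.loads(line)
    group = record.get("group", {})
    return {
        "timestamp_ms": record.get("mtime"),
        "response": record.get("response"),
        "city": group.get("group_city"),
        # Keep only the topic keys; missing topics yield an empty list.
        "topics": [t["urlkey"] for t in group.get("group_topics", [])],
    }


parsed = parse_rsvp(raw_line)
print(parsed["city"], parsed["topics"])
```

Flattening each record early keeps the later steps (category counts, topic networks, time series) independent of the raw stream format.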

Analysis


Category Distribution - San Francisco


To explore trending activities in San Francisco, group categories are extracted from event information. The chart below shows that most meetups focus on socializing, business/networking, languages, and technology.
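A category distribution like this can be computed with a simple count. The sketch below uses made-up events and an assumed `category` field; `category_distribution` is a hypothetical helper, not code from this repository.

```python
from collections import Counter

# Hypothetical events; the "category" field name is an assumption.
events = [
    {"name": "Friday mixer", "category": "socializing"},
    {"name": "Board game night", "category": "socializing"},
    {"name": "Hiking and brunch", "category": "socializing"},
    {"name": "Founders breakfast", "category": "business/networking"},
    {"name": "Startup pitch night", "category": "business/networking"},
    {"name": "Intro to Spark", "category": "technology"},
]


def category_distribution(events):
    """Return (category, count, share) tuples, most frequent first."""
    counts = Counter(e["category"] for e in events)
    total = sum(counts.values())
    return [(cat, n, n / total) for cat, n in counts.most_common()]


distribution = category_distribution(events)
for category, count, share in distribution:
    print(f"{category}: {count} events ({share:.0%})")
```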

NETWORK GRAPH


The visualizations are screenshots of interactive network graphs (built with plotly) that illuminate relationships between topics and segment them into subgroups (industries/categories).

  • Topics here are the hashtags and keywords that organizers use to describe events

A graph of the tech landscape in San Francisco

The basic idea is that topics in the same tech field will often be mentioned by the same Meetup groups. For example, if the business challenge of creating value from big data requires the combination of database technologies, analytics methods and parallel processing frameworks, these topics are likely to be of interest to the same practitioners. As a consequence, we would expect to find them mentioned by the same groups, in a way that defines a ‘data’ technology field and its community of practitioners.
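The co-occurrence idea can be sketched in a few lines: count, for every pair of topics, how many groups mention both. The group names and topics below are invented for illustration, and `cooccurrence_edges` is a hypothetical helper rather than the project's actual code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mapping of Meetup group -> topics it lists.
group_topics = {
    "sf-data-eng": ["big-data", "hadoop", "machine-learning"],
    "bay-analytics": ["big-data", "machine-learning", "data-visualization"],
    "sf-game-dev": ["game-development", "virtual-reality"],
}


def cooccurrence_edges(group_topics):
    """Count how many groups mention each unordered pair of topics."""
    edges = Counter()
    for topics in group_topics.values():
        # Sort and deduplicate so each pair has one canonical key per group.
        for a, b in combinations(sorted(set(topics)), 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence_edges(group_topics)
for (a, b), weight in edges.most_common(3):
    print(a, b, weight)
```

The resulting pair counts can serve as edge weights in the network, so that topics shared by many groups end up strongly linked.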

In the topic network, topics that are often mentioned together are linked and “pulled together” (see graph below).

  • The color of a node indicates how important it is within the graph. Darker nodes at the center are general topics (technology, tech, software, ...) that can be used to describe any tech event.

  • Slightly darker nodes on the outside are the central nodes of subgraphs. They indicate natural groupings within the data. This supports community detection: the detection of subgraphs in which a set of nodes is densely connected internally and sparsely connected externally.
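To make the two ideas above concrete, here is a small sketch under stated assumptions. Node importance is approximated by weighted degree, and subgroups are found by removing the most general hub topic and taking connected components. That is a simple stand-in for illustration, not a full community-detection algorithm and not necessarily the method used for the graphs in this project. The edge weights are invented.

```python
from collections import defaultdict

# Hypothetical weighted edges: (topic_a, topic_b) -> number of groups mentioning both.
edges = {
    ("technology", "big-data"): 3,
    ("technology", "machine-learning"): 3,
    ("technology", "game-development"): 2,
    ("technology", "virtual-reality"): 2,
    ("big-data", "machine-learning"): 4,
    ("game-development", "virtual-reality"): 3,
}


def weighted_degree(edges):
    """Sum of edge weights per node, used here as a proxy for importance."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


def components_without(edges, removed):
    """Connected components of the graph after dropping the `removed` nodes."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a in removed or b in removed:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


degree = weighted_degree(edges)
hub = max(degree, key=degree.get)          # most general topic
subgroups = components_without(edges, {hub})
print(hub, subgroups)
```

With real data, a dedicated community-detection method (for example, modularity-based clustering) would usually be preferable, since real topic graphs rarely split cleanly once a single hub is removed.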

San Francisco Activity Landscape

The same concept is applied to the topics of all events happening in San Francisco. The segments are clearer and more "pulled apart", indicating that these subgroups are more distinct from each other.

TIME SERIES


With the topic networks, we have the power to visualize the emergence of a new field.

For example, fifty years ago, technology topics would focus on hardware and IT. Nowadays, technology is divided into Big Data, Software Engineering, Graphic Design, Gaming, and Virtual Reality. I assume that in the next few years, a new subgroup such as Robotics will appear with new topics of its own. We can see how it emerges and grows over time.

We then use time series to track its activity over time and see how its popularity changes.
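One simple way to build such a series is to bucket topic mentions by ISO week. The sketch below assumes each mention is a `(date, topic)` pair; the data is invented and `weekly_counts` is a hypothetical helper, not the project's actual pipeline.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical (ISO date, topic) mentions, e.g. one per RSVP to an event tagged with the topic.
mentions = [
    ("2017-07-10", "machine-learning"),
    ("2017-07-12", "machine-learning"),
    ("2017-07-18", "machine-learning"),
    ("2017-07-19", "machine-learning"),
    ("2017-07-20", "machine-learning"),
    ("2017-07-11", "virtual-reality"),
]


def weekly_counts(mentions):
    """Map each topic to a chronologically sorted list of ((year, week), count)."""
    series = defaultdict(Counter)
    for day_string, topic in mentions:
        year, week, _weekday = date.fromisoformat(day_string).isocalendar()
        series[topic][(year, week)] += 1
    return {topic: sorted(counts.items()) for topic, counts in series.items()}


series = weekly_counts(mentions)
for topic, points in series.items():
    print(topic, points)
```

Plotting each topic's list of weekly counts gives a popularity curve; a rising curve over consecutive weeks suggests a trending topic.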

Below are some examples of topic popularity over the last few months.

Top Trending Topics - San Francisco


DataScience Topics - San Francisco


Labeled-Category Groups
