Imagine if you could create an event that generated an increase in attendance of over 4,000% in five years. Well, that’s exactly what DataStax has done with the annual Cassandra Summit in the Bay Area.
4,000% increase in attendance
This past week’s 2015 Cassandra Summit, held in Santa Clara, was the largest NoSQL conference in the world. Prior to the start, there were over 6,100 registrations onsite. Consider that the very first Cassandra Summit was only five years ago, and it had a whopping 145 attendees. What an amazing testament to the strength of Cassandra in the community.
Largest NoSQL conference in the world
There were many huge names that helped sponsor the event this year, which allowed for free admission. If you were lucky enough to make the annual trek to California, you were overwhelmed with nearly 150 sessions of fantastic content by some of the brightest members in the community. All volunteer work too.
This was the first year that DataStax was flooded with submissions from people wanting the opportunity to speak at the Summit. They actually had to turn people’s paper submissions down. Amazing! I feel so lucky to have gotten the chance to attend. The networking, education, and sense of community are overwhelming.
Every year, the Summit is kicked off by a keynote that includes a “Where the Industry is Going” speech by Billy Bosworth and a “State of Cassandra” speech by Jonathan Ellis. This year, there was a newcomer to the stage. Microsoft’s EVP of Cloud, Scott Guthrie, was there to tell the community that Microsoft has joined in the Cassandra fun. Wait, what?
Microsoft & Cassandra sound like oil & water. Even Billy pointed out that most of us in the community are die-hard Unix users who loathe having to get on a Windows platform. Scott was there to sway our opinion, as Microsoft is changing its course to favor the open source community.
This year, there were so many sessions for learning that you really had to be selective in what you could attend. A lot of the sessions that I tried to attend were so popular that they had to close when maximum room occupancy was reached. DataStax anticipated this and gave attendees a chance to purchase a Priority Pass. It would give you first access to all sessions, which could get you into the popular sessions before they closed. Great idea, if you ask me.
I chose not to attend a ton of the sessions. I took the opportunity to mingle and network amongst the crowd as much as I could. It also gave me a chance to meet people in person that I’ve interacted with online for years. I got to catch up and share a beer with so many Cassandra-ians, in a way only the Summit can provide.
Some of the sessions I tried to attend were closed due to popularity, but I was able to make it into others. Probably the most memorable one, to me, was CrowdStrike’s “1 million writes per second on 60 nodes with Cassandra and EBS”. What a turn from the traditional thinking that you can’t do high performance on Elastic Block Store. Up until now, it was Netflix’s benchmark that was the standard for how to achieve 1 million writes per second. But it required a lot more nodes than 60 and had to use SSDs to get that kind of high IOPS. Just goes to show what the community is capable of.
In addition to the sessions, DataStax provided a day before the Summit for people to take Cassandra certification tests. Just in that one day, over 500 people were able to receive certifications like Cassandra Certified Developer & Certified Administrator.
Every year, DataStax tries to record most of the Summit’s sessions so they can post them to Planet Cassandra for the community to enjoy. That way you are able to see the sessions that you missed out on. Some of the breakouts and booth presentations are not recorded. Those you just have to be there to see live. All the recordings & slides take a little while to gather from the speakers, so they’re not accessible for a while after the Summit. A lot of the speakers will post their slides online somewhere before the sessions are published. I scoured the internets for all the slideshares I could find. Of the nearly 150 sessions, I was able to find slides for only 40.
Here is the list of available slides (in no particular order). Enjoy!
- Billy Bosworth, Jonathan Ellis, & Scott Guthrie – Keynote
- Nate McCall – Hardening Apache Cassandra for Compliance (or Paranoia)
- Evan Chan – Breakthrough OLAP
- Caleb Rackliffe – Intro to DSE Search
- Carl Yeksigian – Materialized Views
- Avi Kivity – Scylla 1 Million CQL operations per second per server
- João Paulo Eiti Kimura – A Change of Seasons
- Aaron Morton – Steady State Data Size with Compaction Tombstones and TTL
- Aaron Morton – Repeatable, Scalable, Reliable, Observable Cassandra
- Mick Semb Wever – Distributed Tracing from Application to Database
- Carlos Alonso – Troubleshooting Cassandra performance issues as a developer
- Gary Stewart & Christopher Reedijk – Exploiting Hotel Cassandra
- Rekha Joshi – Reporting From Trenches: Using Cassandra Effectively
- Russell Spitzer – Spark Cassandra Connector: Past, Present, and Future
- Robbie Strickland – Lambda at Weather Scale
- Sebastián Estévez – Lessons from >100 Startups
- Ben Bromhead – Securing Cassandra
- Benjamin Lerer – A deep look at the cql where clause
- Roopa Tangirala – Netflix’s Big Leap from Oracle to C*
- Jeff Jirsa – Real World DTCS For Operators
- Martin Zapletal – Cassandra as event sourced journal for big data analytics
- Joel Knighton – Testing Cassandra Guarantees under Diverse Failure Modes with Jepsen
- Joe Stein – Real-time Log Analysis with Apache Mesos, Kafka and Cassandra
- Julien Anguenot – Leveraging Cassandra for real-time multi-datacenter public cloud analytics
- Jon Haddad – Enter the Snake Pit for Fast and Easy Spark
- Peter Nichol – Why DBaaS is taking off with Cassandra
- Rob Bagby & Jesus Aguilar – Building a massively scalable system with DataStax and Microsoft’s next generation Paas infrastructure
- Vinay Sridhar – Persistent Memory and Cassandra
- Kiyu Gabriel & Adam Mollenkopf – When and Where are all the Things: Geotemporal IoT Search and Analytics
- Frank Ober & Al Tobey – 3D XPoint and NVME Technology Cassandra Storage Comparison
- Vlad Giverts – Building Large Scale Machine Learning Pipelines
- Victor Anjos – Cassandra Installation to Optimization
- Ben Whitehead & Robert Stupp – A New Way to Run Cassandra
- Stephen Mallette – What’s New in Apache TinkerPop
- Chris Fregly – Real time Advanced Analytics with Spark and Cassandra
- Luke Tillman – Relational Scaling and the Temple of Gloom
- Joe Stein – Real-Time Log Analysis with Apache Mesos, Docker, Kafka, Spark, Cassandra, and Solr at scale
- Gustavo Rene Antunez & Carlos Rolo – My First 100 days with a Cassandra Cluster
- Aki Colovic – Skinny on Wide Rows
- Ben Slater – When and how to migrate from a relational database to Cassandra
By Adam Hutson
Adam is a Data Architect for DataScale, Inc. He is a seasoned data professional with experience designing & developing large-scale, high-volume database systems. Adam previously spent four years as a Senior Data Engineer for Expedia building a distributed Hotel Search using Cassandra 1.1 in AWS. Having worked with Cassandra since version 0.8, he was early to recognize the value Cassandra adds to Enterprise data storage. Adam is also a DataStax Certified Cassandra Developer.