We are scheduling all our events through our “Research Triangle Analysts” meetup group
http://www.meetup.com/Research-Triangle-Analysts/. You can also view all events in calendar form. We have a meeting on the third Thursday of every month at 6:30 p.m. (location and topic to be announced). We also have a monthly lunch on the first Friday of every month at noon (location to be announced).
UPCOMING EVENTS – 2014:
Friday, November 7th, 2014 – noon (monthly lunch):
Thursday, November 20th, 2014 – 6:30 p.m. (regular meeting):
## ARCHIVED 2014 Events
Thursday, October 16th, 2014 – 6:30 p.m. (regular meeting):
Location: Chow, 311 Creedmoor Rd, Raleigh, NC 27613
Steve Geringer presented: “How to Build Effective Machine Learning Applications”
Machines are getting smarter every day. How do they do it? What will be left for the humans once the machines completely take over? Learn how you can contribute to the subjugation of mankind by building your own machine learning applications. While we won’t cover sci-fi or philosophical aspects, we will cover many important technical considerations for building effective machine learning (ML) applications.
Steve Geringer is a Triangle-area software consultant and ML enthusiast.
Friday, October 3rd, 2014 – noon (monthly lunch):
Location: Cafe Carolina and Bakery, 137 Weston Pkwy, Cary, NC 27513
Thursday, September 18th, 2014 – 6:30 p.m. (regular meeting):
Location: RENCI, 100 Europa Drive, Suite 540, Chapel Hill, NC
Elizabeth Claassen presented: “Improved Inference in Generalized Linear Mixed Models”
In small samples it is well known that the standard methods for estimating variance components in a generalized linear mixed model (GLMM), pseudo-likelihood and maximum likelihood, yield estimates that are biased downward. An important consequence of this is that inferences on fixed effects will have inflated Type I error rates because their precision is overstated. We introduce a new method for estimating parameters in GLMMs that applies a Firth bias adjustment to the maximum likelihood-based GLMM estimating algorithm. We apply this technique to one- and two-treatment logistic regression models with a single random effect. We show simulation results that demonstrate that the Firth-adjusted variance component estimates are substantially less biased than maximum likelihood estimates and that inferences using the Firth estimates maintain their Type I error rates more closely than the standard methods.
Friday, September 5th, 2014 – noon (monthly lunch):
Location: McDaids Irish Pub & Restaurant, Hillsborough Street, Raleigh, NC
Thursday, August 21st, 2014 – 6:30 p.m. (regular meeting):
Location: Blue Cross Blue Shield, 5901 Chapel Hill Blvd, Chapel Hill, NC
Laurel Trantham presented: “Utilization and Substitution of Urgent Care, Emergency Departments, and Primary Care Physicians”
Blue Cross Blue Shield is always looking to reduce healthcare costs. One driver of high costs is that many individuals receive medical care at emergency rooms when urgent care centers and primary care offices may be more appropriate sites of care. Laurel Trantham reviewed some of the analysis in this area, including why this is important to explore, and discussed several modeling approaches being considered.
Friday, August 1st, 2014 – noon (monthly lunch):
Location: MEZ Contemporary Mexican Restaurant, 5410 Page Road, Durham, NC
Thursday, July 17th, 2014 – 6:30 p.m. (regular meeting):
Location: Cameron Village Regional Library
Brian Fannin presented: “Statistics Without Borders”
Brian Fannin shared his experience as part of ‘Statistics Without Borders’ (http://community.amstat.org/statisticswithoutborders/home/). The team spent a week in Africa teaching R and statistical modeling to members of the Rwandan Biomedical Center.
Brian is not a proper statistician (he’s an actuary), but he loves R, loves to travel and loves to try and make the world a better place through data. He especially loves doing all three at once.
Friday, July 11th, 2014 – noon (monthly lunch):
Location: An, 2800 Renaissance Park Place, Cary, NC 27513 (I-40 exit 287)
Thursday, June 19th, 2014 – 6:30 p.m. (regular meeting):
Location: Cameron Village Regional Library
Mason DeCamillis presented: “Introduction to Julia”
Julia is a relatively new programming language that aims to blend the good parts of Matlab, C, R, and Python (with fewer of the bad parts). Its growth in popularity makes it an increasingly promising option for programmers doing technical, computationally intensive work. This presentation explored the advantages of Julia in a data analysis context, with examples from both the base library and several user-written packages. Additional information is available at http://julialang.org/ and http://julia.readthedocs.
Mason DeCamillis is a statistical programmer and data analyst with a Master’s degree in Applied Statistics and a knack for crashing his computer by testing out experimental software. He is cautiously enthusiastic about Julia (see http://www.meetup.com/Triangle-Julia-Users/ ), and is excited to share with Research Triangle Analysts.
Friday, June 6th, 2014 – noon (monthly lunch):
Location: City Beverage, 4810 Hope Valley Rd, Durham, NC 27707
Thursday, May 15th, 2014 – 6:30 p.m. (regular meeting):
Location: Cameron Village Regional Library
Joseph Morgan presented: “Covering Arrays”
Software (and analytical model) testing may require considering hundreds or thousands of parameters. Usual “test all” or “full factorial” methods can require too many runs to be practical. Covering arrays make it possible to consider “full coverage” of a software suite with a smaller number of runs (see http://math.nist.gov/coveringarrays/coveringarray.html ).
Joseph Morgan, Senior Software Developer for JMP at SAS Institute, presented his research in this important field.
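To make the idea concrete, here is a minimal sketch (not taken from the talk) that checks whether a small set of runs covers every pair of settings. The three on/off factors and the helper name `covers_all_pairs` are hypothetical, chosen only for illustration.

```python
from itertools import combinations, product


def covers_all_pairs(runs, levels):
    """Return True if, for every pair of factors, every combination
    of their levels appears together in at least one run."""
    for i, j in combinations(range(len(levels)), 2):
        needed = set(product(range(levels[i]), range(levels[j])))
        seen = {(run[i], run[j]) for run in runs}
        if not needed <= seen:
            return False
    return True


# Hypothetical example: three on/off settings (2 levels each).
levels = [2, 2, 2]

# "Test all" approach: every combination of every setting.
full_factorial = list(product(*(range(n) for n in levels)))

# A strength-2 (pairwise) covering array for the same settings.
pairwise_runs = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

print("full factorial runs:", len(full_factorial))
print("pairwise runs:", len(pairwise_runs))
print("all pairs covered:", covers_all_pairs(pairwise_runs, levels))
```

Here four runs cover every pair of settings that the eight-run full factorial covers. The savings grow quickly as the number of parameters increases.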
Friday, May 2nd, 2014 – noon (monthly lunch):
Location: Dales Indian Restaurant, RTP location: 5410 Nc Highway 55, Durham, NC 27713
Thursday, April 17th, 2014 – 6:30 p.m. (regular meeting):
Location: Mattie B’s Public House
Have an Idea, Need an Idea!
1) Come in with an idea you’d like to discuss – either a problem that you’re stuck on, or a great idea where you’d like some feedback.
2) Be ready to present to a small group of about 4-6 people while you enjoy the great food and craft beer at Mattie B’s. This is a sit-down presentation. You can bring your laptop and show some code if you want, but this is mostly a chance to “think out loud” with some interested folks.
One of the biggest interest areas from the Feedback survey was “Want to connect with peers,” but the social events got the most votes for “least favorite” meetings. This is a chance to find people who are interested in some of your favorite topics!
Friday, April 4th, 2014 – noon (monthly lunch):
Location: Shiki – Sushi & Asian Fusion
Thursday, March 20th, 2014 – 6:30 p.m. (regular meeting):
Location: MetLife has generously provided space for this event.
Dan Kelly presented: “Random Forests and Boosted Trees”
The decision tree is one of the most-used predictive modeling techniques: it is easy to interpret and has good predictive properties. But single decision trees can overfit your data and give misleading results. How do you decide when the tree has enough “branches”? Enter the random forest.
We had a discussion on our new mission statement at this meeting (attendees shared with us how they would like us to serve them and what they envision the Research Triangle Analysts to become in the future). After party at MEZ Contemporary Mexican Restaurant.
Friday, March 7th, 2014 – noon (monthly lunch):
Location: Bonefish Grill
We brainstormed on starting a nonprofit organization.
Thursday, February 20th, 2014 – 6:30 p.m. (regular meeting):
Location: Cameron Village Regional Library
Lucia Gjeltema presented: “Community Detection”
Network graph analysis is a hot topic in social media, fraud detection, and academia. In many applications, networked individuals end up in one large “clump”, making further analysis nearly impossible. Community detection is one way to break a huge graph into small meaningful groups for real-world analyses.
Various structural definitions of graph communities were introduced and an overview of algorithms that capture them was given. The presentation was concluded with a review of performance metrics that compare detected communities with ground-truth information.
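The abstract does not say which performance metrics were reviewed. As one illustration, the Rand index is a common way to score detected communities against ground truth: it is the fraction of node pairs on which the two groupings agree. The sketch below uses made-up labels for six nodes.

```python
from itertools import combinations


def rand_index(detected, truth):
    """Fraction of node pairs that both labelings treat the same way:
    either together in one community or split across two."""
    pairs = list(combinations(range(len(truth)), 2))
    agreements = sum(
        (detected[i] == detected[j]) == (truth[i] == truth[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical labels: node k belongs to community labels[k].
truth = [0, 0, 0, 1, 1, 1]
detected = [0, 0, 1, 1, 1, 1]  # node 2 was put in the wrong community

print(f"Rand index: {rand_index(detected, truth):.3f}")
```

Because only pairwise agreement is counted, the score does not depend on which label names the algorithm happens to assign.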
Friday, February 7th, 2014 – noon (monthly lunch):
Location: Backyard Bistro
We discussed starting a nonprofit organization.
Thursday, January 16th, 2014 – 6:30 p.m. (regular meeting):
Location: Cameron Village Regional Library
Tim Hopper presented: “Intro to Scikit-Learn”
Scikit-learn is an actively developed Python package providing an implementation of many machine learning algorithms (e.g. SVM, kNN, linear models, HMM, k-Means, spectral clustering). However, the benefits of Scikit-learn go well beyond carefully implemented learning algorithms. Being built in Python, it allows easy integration with countless other Python modules for tasks such as plotting, data munging, and application development. Its consistent API across algorithms allows for rapid experimentation with multiple learning methods. Also, Scikit-learn is well documented and provides lots of examples.
Instead of discussing particular machine learning algorithms provided by the package, I will focus on Scikit-learn and Python as a toolkit for solving data problems from start to finish. I will emphasize the Pipeline tool which allows the user to chain together all the steps of a machine learning pipeline including preprocessing, dimensionality reduction, feature selection, and model fitting.
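For readers who have not seen the Pipeline tool, here is a minimal sketch of the kind of chain described above. It is not the code from the talk. The iris dataset, the step names, and the specific choices (`StandardScaler`, `PCA`, `SelectKBest`, `LogisticRegression`) are assumptions picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, split into training and held-out parts.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Each step is a (name, estimator) pair; steps run in order.
pipeline = Pipeline([
    ("scale", StandardScaler()),                         # preprocessing
    ("reduce", PCA(n_components=3)),                     # dimensionality reduction
    ("select", SelectKBest(score_func=f_classif, k=2)),  # feature selection
    ("model", LogisticRegression(max_iter=1000)),        # model fitting
])

# One call fits every step on the training data only.
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the whole chain behaves like a single estimator, the same object can be passed to tools such as cross-validation without leaking test data into the preprocessing steps.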
## ARCHIVED 2013 Events
Thursday, November 21st, 2013 – 6:30 p.m. (planning meeting):
Location: Larry’s Southern Kitchen
Plan next year! This has been a great year for RTA. We now have 100 members on Meetup, and we’ve had some amazing speakers and guests. Help us plan to make the group even bigger and better next year.
Friday, November 1st, 2013 – noon (monthly lunch):
Location: BurgerFi – Cary
November is RTA planning month! Join us for a lunchtime roundtable on where the analytics field is heading and what we should do next year.
Thursday, October 17th, 2013 – 6:30 p.m. (regular meeting):
Location: Buffalo Brothers on Lake Boone Trail
4025 Lake Boone Trail Suite #100, Raleigh, NC 27607
Dahl Winters presented: “Scaling the Big Data Mountain”.
In this whirlwind hour I will attempt to blaze a trail through the wilderness that is big data science. Given a mountain of unstructured data and the jungle of options in the Hadoop ecosystem, it can be difficult to know which tools to use for which investigations. We will take a guided tour of the most common Hadoop use cases, peer into NoSQL and graph databases, march over to machine learning, avoid sinking into deep learning, and cover some of the classification and clustering algorithms I’ve worked with in my big data explorations. If you can survive this hour unscathed, you will be that much more prepared to tackle your own big data mountain.
Thursday, September 19th, 2013 – 6:30 p.m. (regular meeting):
Location: Cisco Systems, 7200 Kit Creek Rd, Morrisville, NC
Building 11, First Floor, Conference Room E-UNION
“Big Data Analytics and CyberSecurity”
No food at this event. After party instead at Trali Irish Pub’s NEW LOCATION.
Big data is expected to play a crucial role in the cybersecurity landscape. Learn how the security industry is using big data analytics and integrating Artificial Intelligence techniques (statistical analysis, autonomic/agent-based computing, ensemble classification, game-theoretic self-optimization) within the framework of distributed, intelligent, and forward-thinking security architecture. For example, Cisco is using these techniques to create solutions in the domain of Network Behavior Analysis (NBA), in order to fight against modern sophisticated attacks in today’s cyberspace, including Advanced Persistent Threats (APT), exploit kits, zero-day attacks, polymorphic malware and trojans inside the client’s network.
Thursday, August 15th, 2013 – 6:30 p.m. (regular meeting):
“Tool Throwdown: Kaggle competition – Titanic dataset”
RTA founders demonstrated their predictive modeling skills using their favorite statistics and programming tools. On display will be SAS, R, JMP, and maybe more!
Description: Analyzing the Titanic data set from the Kaggle competition.
Thursday, July 18th, 2013 – 6:30 pm (regular meeting):
Location: This month’s event space has been graciously provided by the Institute for Advanced Analytics at North Carolina State University.
Oscar Boykin presented: “Sketching and Streaming: building large-scale, real-time relevance features at Twitter”.
We will discuss approximation algorithms for fast, cheap and accurate aggregation, which are used in production at Twitter. We will also briefly cover the open source software we released to do this: scalding, algebird and storm.
Dan Kelly presented: “Assessment and Comparison of Predictive Models with Binary Targets”, a practical guide for people who build predictive models.
Oscar Boykin is a native of Raleigh. He is currently on the analytics infrastructure team at Twitter, and co-creator of the Twitter open source projects: scalding, algebird, bijection, chill, and summingbird.
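The “Sketching and Streaming” abstract above does not name specific algorithms. As one well-known example of approximate aggregation, here is a minimal Count-Min sketch in plain Python. It counts items in a stream using a fixed amount of memory and may overestimate a count, but never underestimates it. The class, its default sizes, and the sample stream are illustrative assumptions, not the production code discussed in the talk.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter with fixed memory (width x depth)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hash per row, derived by salting the item with the row number.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum across
        # rows is the tightest available upper bound on the true count.
        return min(self.table[row][col] for row, col in self._cells(item))


# Hypothetical stream of hashtags.
stream = ["#rstats"] * 5 + ["#julia"] * 3 + ["#sas"]
sketch = CountMinSketch()
for tag in stream:
    sketch.add(tag)

print("estimated count for #rstats:", sketch.estimate("#rstats"))
```

Memory use depends only on `width` and `depth`, not on how many distinct items appear, which is what makes this kind of structure attractive for real-time streams.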
Thursday, June 20th, 2013 – 6:30 pm (regular meeting):
Location: Saladelia (in their back room)
4201 University Drive, Durham, NC 27707
Ian Cook presented: “Workshop on submitting R jobs to the cloud”
Bring your laptops, enjoy the wifi and great food, and talk about data!
Thursday, May 16th, 2013 – 6:30 pm (regular meeting):
Location: The Cuban Revolution, 318 Blackwell St, Durham, NC
Social / Networking meeting.
Adam Sobsey presented: “Sabermetrics”
Adam writes for Baseball Prospectus, one of the premier publications for baseball statistics. “Our way of understanding baseball has undergone a revolution during the last generation. The field of baseball study known as “sabermetrics” (based on the acronym of the Society for American Baseball Research) has made huge advances in our approach to the complexity of the game, much of it via more thorough and sophisticated statistical analysis (aided by technological innovations, as well). Among the results of all this study is the essential sabermetric concept of the “Replacement Player.” The Replacement Player is an important but somewhat nebulous platonic ideal. The prevailing agreement is that he is basically good enough to play at the Triple-A minor-league level — the highest level below the major leagues — but does not have the skills to succeed for long stretches in the majors themselves. As it happens, the Durham Bulls are a Triple-A baseball team, all of its players striving to surpass and escape “replacement level” baseball. My talk will discuss some of the ways in which sabermetrics has changed our understanding of the game of baseball for the good, and some of the ways in which that understanding is still a work in progress–all against the very real backdrop of the men playing the game itself.”
Tuesday, April 2nd, 2013 – 6:30 pm (special event):
Location: Cuban Revolution, 318 Blackwell St, Durham NC
John D. Cook: Information is Cheap, Meaning is Expensive: How to Hire and Work With an Analyst (without breaking the bank)
More and more companies are investing in information, through better databases and more robust data tools. Many are finding, however, that extracting meaning from all that information is more difficult than they thought. There are many analysts who can assist–either as freelancers or employees–but how do you know you’re hiring the right talent? Should you hire a full-time analyst or a contractor? How much should you pay? What skills should they have?
John D. Cook has over 20 years of experience applying mathematics to real-world problems. He has worked with firms large and small, using his skills and expertise to turn the data they have into the information they need. During this question and answer session, John will discuss how to connect with the right talent, how to budget for an analysis project, and what to expect from an expert analyst.
Friday, April 5th, 2013 – noon (monthly lunch):
Location: Serena’s in RTP, 5311 South Miami Blvd, Durham NC
Michael Blanks presented: “Open Data & Government”.
Thursday, March 21st, 2013 – 6:30 pm (regular meeting):
Location: Louie & Charlie’s Grille & Tavern
John Sall presented: “From Big Data to Big Statistics”.
When you scale up the analysis, you have a lot of issues to address. When you have a lot of data, even a small difference is significant. When you screen a lot of hypotheses, adjusting for selection or multiple test bias is an issue. When you have a lot of bad data, making the analysis automatically robust becomes important. When you have big data, you need to make the computer work fast to get the job done. When you have thousands of results, you need to create compact summaries to show you all the results in one page, or at least produce the results sorted by significance. All these issues need to be resolved and the solutions encapsulated into a workflow for engineers and scientists that deal with more data each year.
John Sall is a co-founder and executive vice president of SAS Institute. He leads the JMP Division of SAS.
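The abstract does not say which multiple-testing adjustment was discussed. As one standard illustration of the screening problem, the sketch below applies the Benjamini-Hochberg procedure, which controls the false discovery rate, to a made-up list of p-values and compares it with naive thresholding at 0.05.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg false discovery rate procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank

    # Reject the hypotheses with the largest_k smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from screening ten hypotheses.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]

naive = sum(p < 0.05 for p in p_values)
adjusted = sum(benjamini_hochberg(p_values))
print("significant without adjustment:", naive)
print("significant after Benjamini-Hochberg:", adjusted)
```

With thousands of tests, sorting results by adjusted significance in this way is one route to the compact summaries the abstract mentions.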
Friday, March 1st, 2013 – noon (monthly lunch):
Bruce Connor led a discussion on “analytics for polling data”.
This discussion focused on the methods behind Nate Silver’s election predictions. Participants were invited to discuss the methods and their experience with other applications of the same methods.
Melinda Thielbar presented: “Data Science is not a Fad. Let’s Keep it That Way”.
This presentation discusses the technical details of data science in the context of time series analysis and statistical modeling. A really good presentation for anyone interested in a hype-free primer on data science.
Friday, February 1st, 2013 – noon (monthly lunch):
Location: Neomonde, 10235 Chapel Hill Road, Morrisville, NC 27560
Linda Schumacher presented: “Running a Kaggle Competition team”
RTA will be organizing a Kaggle team this year! Anyone who is interested in joining the team or just learning more about Kaggle will benefit from this meeting.
Eric Yount presented: “Analytic Methods for Clinical Data”
This will be very informative for those who are primarily working in data mining and business analytics. The techniques Eric will discuss and the reasoning behind them present a different way of looking at data. Clinical trials experts will have an opportunity to discuss the process of collecting and analyzing clinical data.
## ARCHIVED 2012 Events
Thursday November 15th, 2012 – 6:30 PM
“Looking Ahead to 2013”
Thursday October 18th, 2012 – 6:30 PM
“Educating Analysts: How Can Schools Prepare Students for a Quantitative Career?”
by Bill Burpitt, Associate Dean at the School of Business, Elon University
Thursday September 20th, 2012 – 6:30 PM
Location: Cuban Revolution
Thursday August 16th, 2012 – 6:30 PM
“Why the Future Will Convert Better” by Martin (Marty) Smith
Thursday July 19th, 2012 – 6:30 PM
Location: Cuban Revolution
Thursday June 21st, 2012 – 6:30 PM
“Applications of R and R Mini Hack-A-Thon”, led by Ian Cook, TIBCO Spotfire
Thursday May 17th, 2012 – 6:30 PM
Presentation by MaxPoint Interactive
Location: MaxPoint Interactive
Thursday April 19th, 2012 – 6:30 PM
Location: Wild Wing Cafe in Brier Creek
Thursday March 15th, 2012 – 6:30 PM
“Have an Idea, Need an Idea”
Location: Earth Fare
Thursday February 16th, 2012 – 6:30 PM
Location: Trali Irish Pub
Thursday January 19th, 2012 – 6:30 PM
First Triangle Analysts Social/Networking Meeting!
Location: Trali Irish Pub