Seminar on Semantic Solutions at Cerner

Sumanth Kalli

Karthik Vishwanath, a team leader at Cerner Corporation, gave a guest lecture titled “Semantic Solutions at Cerner” Oct. 30 in Flarsheim Hall. The seminar was hosted by Dr. Yugyung Lee, associate professor in the School of Computing and Engineering.

Vishwanath leads a team at Cerner that builds indexes, services and infrastructure to make electronic medical records and population health systems easier to search, an area known as semantic solutions. He has been building distributed systems in the healthcare domain for more than five years. His team also builds infrastructure for creating custom search data sets.

“As we’ve gotten deeper into cloud technologies at Cerner, the range of options to make things better has expanded drastically,” Vishwanath said. “Through this process, our team has built up a cloud scale search platform that we call Brahe.”

Brahe, a large-scale, flexible indexing platform, uses open-source projects such as Apache ZooKeeper, Apache HBase and Apache Solr.

“This talk will focus on a little about each of these technologies and why we chose them to build a system that not only scales but clears away a good deal of operational headaches,” Vishwanath said. “Reading a 200 page patient record is not that easy, and we want to make the patient record searchable. So, we initiated this project four [or] five years ago and it is a long journey. Dated back to 2008, the tools are not as mature as they are now and we made lots of mistakes and learned from them. But, it is working well for us.”

In his presentation, Vishwanath started with the differences between batch processing and real-time processing of data. The batch approach relies on relational database management systems and a crawler that detects changes in the data. It also organizes data in raw tables, which are then converted into semantic indexing tables. However, batch processing cannot be used when data must be processed in real time.

To handle the real-time processing that the batch approach cannot provide, Cerner uses Apache Storm and Apache Hadoop arranged in different topologies, and scheduling depends on the topology. Java is typically used for this purpose.

“The goal is to complete this process in 200 microseconds and we are now able to achieve somewhere between 200 and 300 microseconds,” Vishwanath said. “At Cerner, we use Oracle databases and our own server databases to make it easy for crawlers to guess the type of data which is coming in. At Cerner, we are very familiar with Apache HBase. It is pretty decent and it gave us all we need.”

Wrapping up, Vishwanath summarized his presentation in five points: keep it simple, disconnect stages, keep your points at a minimum, organize data around queries and use what one is good at.
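One of those points, organizing data around queries, can be illustrated with a short sketch. The key layout below is an assumption chosen for illustration and not a description of Brahe: if the expected question is “which of this patient’s notes mention a given term,” the data can be stored under a `(patient_id, term)` key so that the answer is a single lookup.

```python
from collections import defaultdict


def build_query_table(notes):
    """Build a lookup keyed by (patient_id, term).

    `notes` is an iterable of (patient_id, note_id, text) tuples; the
    shape of these tuples is hypothetical.
    """
    table = defaultdict(list)
    for patient_id, note_id, text in notes:
        terms = {word.strip(".,;:").lower() for word in text.split()}
        terms.discard("")
        for term in sorted(terms):
            table[(patient_id, term)].append(note_id)
    return table


def lookup(table, patient_id, term):
    """Answer the target query with one key lookup and no scan."""
    return table.get((patient_id, term.lower()), [])
```

The trade-off is that the same note is stored under many keys, so more is written up front in exchange for cheaper reads.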

“We are doing excellent work by contributing to open source, which is great, and we have collaborated with consulting firms,” Vishwanath said. “It is good to see Netflix and Twitter open sourcing. We are planning to open source Brahe soon. We are planning to add three or four projects this year into Cerner’s GitHub account and will be adding more in the years to come.”

Vishwanath spoke about the present job market and qualifications.

“At Cerner, they try to see if a candidate has good computer science fundamentals, how well you can understand core issues, some coding and the projects along with their implementations in your resume,” Vishwanath said.

“Cerner Corporation is hiring a lot of people, and even for summer internships, Cerner is a good place to learn. We also have a separate department who looks after sponsoring H-1B visas for the international students,” Vishwanath said.

Vishwanath received his master’s degree in computer science from UMKC in 2008. His work has appeared in a number of publications, including the Journal of the American Medical Informatics Association, and he has presented at conferences such as the American Medical Informatics Association, the Association for the Advancement of Artificial Intelligence and the Symposium on Applied Computing. He also won the American Medical Informatics Association’s Distinguished Paper Award in 2005.