RESEARCH PROJECT TOPIC: Database Management Systems (DBMS)


Large-scale computing services revolve around the management, distribution and analysis of massive data sets. For over 40 years, Berkeley has led the world in recognizing and advancing the centrality of data in computing. Faculty and students at Berkeley have repeatedly defined and redefined the broad database field, combining deep intellectual impact with the birth of multibillion-dollar industries, including relational databases, RAID storage, scalable Internet search and big data analytics. Berkeley also gave birth to many of the most widely used open-source systems in the field, including INGRES, Postgres, BerkeleyDB and Apache Spark. Today, our research continues to push the boundaries of data-centric computing, taking the foundations of data management to a broad array of emerging scenarios.

Topics

Scalable data analysis and query processing

Distributed machine learning, graph analytics, physical and logical optimization of machine learning pipelines, query processing on compressed data, performance prediction, streaming applications.

Low-latency model serving

Online model management and maintenance, prediction serving, real-time personalization, latency-accuracy tradeoffs and edge computing for large-scale models.

Consistency, concurrency, coordination and reliability

Coordination avoidance, consistency and monotonicity analysis, transaction isolation levels and protocols, distributed analytics, fault tolerance and fault injection.
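
As a rough illustration of why monotonicity matters for coordination avoidance, the sketch below merges updates into a grow-only set. Because set union is commutative, associative and idempotent, every delivery order reaches the same final state, so replicas applying these updates need no ordering protocol. The `merge` function and the sample updates are hypothetical and are not drawn from any particular Berkeley system.

```python
from itertools import permutations

def merge(state: frozenset, update: frozenset) -> frozenset:
    """Monotone merge: the state only ever grows (set union)."""
    return state | update

# Hypothetical updates that might reach a replica in any order.
updates = [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"a", "d"})]

# Apply the updates in every possible order and collect the final states.
final_states = set()
for order in permutations(updates):
    state = frozenset()
    for update in order:
        state = merge(state, update)
    final_states.add(state)

# All orders converge to one state, so no coordination is needed here.
converged = len(final_states) == 1
```

A non-monotone operation, such as removing an element, would not have this property in general, which is where coordination or more careful protocol design comes in.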

Declarative languages and runtime systems

Declarative programming applied to distributed systems, networking, machine learning and interactive visualization.

Data storage and physical design

Hot and cold storage, immutable data structures, indexing, data skipping, versioning, implications of hardware evolution.
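
To make the idea of data skipping concrete, here is a minimal sketch that keeps a minimum and maximum value per block of column data and uses them to avoid reading blocks that cannot satisfy a range predicate. The `Block` class, the `scan_range` function and the sample data are assumptions made for illustration; real systems store such metadata in their own formats.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Block:
    """A chunk of column values with min/max metadata (hypothetical layout)."""
    values: List[int]

    def __post_init__(self) -> None:
        # Precompute the summary once; assumes the block is non-empty.
        self.lo = min(self.values)
        self.hi = max(self.values)

def scan_range(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values v with low <= v <= high and the number of blocks read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Skip any block whose [lo, hi] interval cannot overlap the query range.
        if block.hi < low or block.lo > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read

blocks = [Block([1, 4, 7]), Block([10, 12, 19]), Block([20, 25, 31])]
result, touched = scan_range(blocks, 11, 21)
# result is [12, 19, 20]; only 2 of the 3 blocks were read.
```

The benefit depends on how well the data is clustered: if every block spans the full value range, no block can be skipped.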

Metadata management

Data lineage, versioning, usage tracking and collective intelligence, scalability of metadata management services, metadata representations.

Data cleaning, data transformation and crowdsourcing

Human-data interaction including interactive transformation and crowdsourcing, machine learning for data cleaning, statistical properties of data cleaning pipelines.

Secure data processing

Data processing under homomorphic encryption, data compression and encryption, differential privacy, databases in secure hardware enclaves.
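
As one small example of differential privacy, the sketch below answers a counting query with the Laplace mechanism: a count changes by at most 1 when one record is added or removed, so adding Laplace noise with scale 1/epsilon gives an epsilon-differentially-private answer. The function names, the sample `ages` data and the chosen epsilon are assumptions made for illustration, and the code is not hardened for production use (for example, it ignores floating-point side channels).

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy count of matching records; a count query has sensitivity 1."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: how many people are aged 40 or over?
ages = [23, 35, 41, 52, 67, 29, 44]
rng = random.Random(0)  # seeded only to make the sketch repeatable
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon gives answers closer to the true count of 4.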

Data visualization and interaction

Interactive querying and transformation, progressive query visualization, predictive interaction, languages for interactive visualization.
