Dr. Latifur Khan is a tenured full Professor in the Computer Science department at the University of Texas at Dallas, USA, where he has been teaching and conducting research since September 2000. He received his Ph.D. degree in Computer Science from the University of Southern California (USC) in August 2000.
Dr. Khan is an ACM Distinguished Scientist. He received the IEEE Big Data Security Senior Research Award in May 2019 and the Fellow of SIRI (Society of Information Reuse and Integration) award in August 2018. His other prestigious awards include the IEEE Technical Achievement Award for Intelligence and Security Informatics and the IBM Faculty Award (research) in 2016.
Recently, he has become a Fellow of the British Computer Society (BCS) and of the Institution of Engineering and Technology (IET).
Dr. Khan has published over 300 papers in premier journals such as VLDB, Journal of Web Semantics, IEEE TKDE, IEEE TDSC, IEEE TSMC, and AI Research, and in prestigious conferences such as AAAI, IJCAI, CIKM, ICDE, ACM GIS, IEEE ICDM, IEEE BigData, ECML/PKDD, PAKDD, ACM Multimedia, ACM WWW, ICWC, ACM SACMAT, IEEE ICSC, IEEE Cloud and INFOCOM. He has given keynotes and invited talks at a number of conferences hosted by IEEE and ACM. In addition, he has conducted tutorial sessions at prominent conferences such as SIGKDD 2016 & 2017, IJCAI 2017, AAAI 2017, SDM 2017, PAKDD 2011 & 2012, DASFAA 2007 & 2012, ACM WWW 2005, and MIS 2005.
Currently, Dr. Khan’s research focuses on big data management and analytics, data mining and its applications to cyber security, and complex data management, including geo-spatial and multimedia data. His research has been supported by grants from NSF, NIH, the Air Force Office of Scientific Research (AFOSR), DOE, NSA, IBM and HPE.
Department of Computer Science, University of Texas at Dallas (UT Dallas)
Email: lkhan@utdallas.edu
DVP term expires December 2023
Presentations
Big Data Stream Analytics and Its Applications
Data streams are continuous flows of data. Examples include network traffic, sensor data, call center records, and so on. Data streams demonstrate several unique properties that together conform to the characteristics of big data (i.e., volume, velocity, variety and veracity) and add challenges to data stream mining. In this talk we will present an organized picture of how to apply various data mining and machine learning techniques to data streams. In addition, we will present a number of stream classification applications such as adaptive website fingerprinting, textual stream analytics (political actor identification over textual streams), attack trace classification using good-quality similarity metrics (metric learning), and domain adaptation.
This research was funded in part by NSF, NASA, Air Force Office of Scientific Research (AFOSR), NSA, IBM Research, HPE and Raytheon.
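As a rough illustration of the stream classification setting described in this abstract, the following minimal sketch runs a test-then-train (prequential) loop over a synthetic stream whose labeling rule flips halfway through. It is not the method presented in the talk: the `OnlinePerceptron` class, the `synthetic_stream` generator, the window size and the drift point are all hypothetical choices made only for this example.

```python
import random
from collections import deque


class OnlinePerceptron:
    """Tiny incremental linear classifier (illustrative, not from the talk)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score >= 0 else 0

    def learn(self, x, y):
        # Update the weights only when the current prediction is wrong.
        err = y - self.predict(x)
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


def synthetic_stream(n, drift_at, seed=0):
    """Yield (features, label) pairs; the labeling rule flips at drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        positive = x[0] > 0 if t < drift_at else x[0] < 0
        yield x, int(positive)


def prequential_accuracy(stream, model, window=200):
    """Test-then-train: predict each instance first, then learn from it."""
    recent = deque(maxlen=window)
    history = []
    for x, y in stream:
        recent.append(1 if model.predict(x) == y else 0)  # test first
        model.learn(x, y)                                 # then train
        history.append(sum(recent) / len(recent))
    return history


history = prequential_accuracy(synthetic_stream(4000, 2000), OnlinePerceptron(2))
print(f"windowed accuracy before drift: {history[1999]:.2f}")
print(f"windowed accuracy at end:       {history[-1]:.2f}")
```

Because every instance is used for testing before it is used for training, the windowed accuracy reflects how the model copes with data it has not yet seen, including the period right after the labeling rule changes.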
Data to Knowledge: Modernizing Political Event Data, Social media Data and Covid19 Images using Machine Learning (ML)
We have developed the software and big data infrastructure to provide machine-coded event data from news reports, drawing on both historical and real-time inputs from the web. The project is ongoing and will produce coded news reports based on natural language processing (NLP) applications across English and Spanish news reports. Human annotation and validation are carried out for data validation and cross-lingual support. Geo-location of the events is also improved for better spatial resolution. One of the main computational challenges we address in this work is the efficiency and scalability of parsing online news articles in real time. In particular, we designed a distributed system with Apache Spark and Kafka to process large volumes of news articles for the event coders and the actor recommender system. This system processes articles in near real time while generating events, which are provided to end users through our REST API at http://eventdata.utdallas.edu.
We are developing a tool to scrape data from social media (Twitter) and extract actionable information to support first responders for road traffic injury (RTI) victims in low- and middle-income countries. In the long term, this tool will provide new, low-cost technology based on semi-supervised learning that offers simple, accurate and reliable methods to improve the timeliness and accuracy of RTI reporting, shorten response times, and enhance triage decisions by first responders. We will demonstrate how these novel semi-supervised learning techniques can be applied to Covid19 image classification.
The event coding work is a collaborative effort with political scientists at UT Dallas, Dr. Patrick Brandt and Dr. Jennifer Holmes, funded by NSF. The extraction of actionable information from social media is collaborative work with the University of Texas Southwestern Medical Center (UTSW), funded by NIH.
Secure Blockchain via Smart Contracts
With the growing adoption of blockchain technology to provide decentralized solutions to various problems, smart contracts have become so popular that billions of US dollars are currently exchanged through them every day. Meanwhile, attackers have exploited various vulnerabilities in smart contracts to steal cryptocurrencies worth millions of dollars. The automatic detection of smart contract vulnerabilities is therefore an essential research problem. Yet existing solutions largely rely on human experts to define features or rules for detecting vulnerabilities, which often causes many vulnerabilities to be missed and makes these solutions inefficient at detecting new ones. In this study, we address these challenges and propose a framework to analyze the data and detect some vulnerabilities in Ethereum smart contracts on the blockchain platform. We apply machine learning-based (i.e., deep learning-based) vulnerability detection to relieve human experts from the tedious and subjective task of manually defining features and rules. For prevention, an Ethereum bytecode rewriting and validation method will be presented and evaluated for securing smart contracts in decentralized cryptocurrency systems without access to contract source code.
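To give a sense of what working directly on contract bytecode (without source code) can look like, here is a minimal sketch that turns Ethereum bytecode into an opcode frequency vector of the kind a learning-based detector could take as input. It is an assumption made for illustration, not the framework described in the talk: the `OPCODE_NAMES` table covers only a handful of opcodes, and the `disassemble` and `opcode_features` helpers and the vocabulary are hypothetical.

```python
from collections import Counter

# Partial opcode table, for illustration only; the EVM defines many more.
OPCODE_NAMES = {
    0x00: "STOP", 0x01: "ADD", 0x52: "MSTORE", 0x54: "SLOAD",
    0x55: "SSTORE", 0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF1: "CALL", 0xF4: "DELEGATECALL", 0xFF: "SELFDESTRUCT",
}


def disassemble(bytecode_hex):
    """Return the opcode mnemonics in a hex-encoded EVM bytecode string."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    ops = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of immediate data; skip them
            # so that data bytes are not misread as instructions.
            n = op - 0x5F
            ops.append(f"PUSH{n}")
            i += 1 + n
        else:
            # Opcodes missing from the partial table get a generic name.
            ops.append(OPCODE_NAMES.get(op, f"OP_{op:02x}"))
            i += 1
    return ops


def opcode_features(bytecode_hex, vocabulary):
    """Relative frequency of each vocabulary opcode in the bytecode."""
    counts = Counter(disassemble(bytecode_hex))
    total = sum(counts.values()) or 1
    return [counts[name] / total for name in vocabulary]


# Hypothetical vocabulary; a real detector would use a much larger one.
vocabulary = ["PUSH1", "MSTORE", "CALL", "SELFDESTRUCT"]
sample = "0x6080604052"  # PUSH1 0x80, PUSH1 0x40, MSTORE
print(disassemble(sample))
print(opcode_features(sample, vocabulary))
```

Hand-picked frequency features like these are only a starting point. The abstract's point is that a deep learning model can learn its own representation from the opcode sequence instead of relying on expert-defined features and rules.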