Guided by national strategic tasks in cyberspace and by industry demand for big data, the Key Laboratory of Network Data Science and Technology of the Chinese Academy of Sciences is devoted to exploring complexity and computing theory for big data, situation sensing and data representation for cyberspace, big data storage and management systems, big data mining and social computing, network data management engines and big data platforms, and information security. Specifically, our research directions include:
1. Complexity of Network Data and Data Computing Theory
      Based on the heterogeneity, polymorphism, dynamic emergence, and complicated correlations of big network data, the lab studies the laws, complexity theory and computing theory of network data. The research fields include: the aggregation behaviors and propagation laws of network data, its structural and functional stability mechanisms, and its underlying laws; nondeterministic and locally incremental learning theory, prediction of the big-picture trends and emergence laws of data evolution, and new algorithmic and theoretical foundations for network data; system architectures and algebraic computing theory under weak CAP constraints, distributed and stream computing algorithms, complexity measures, and large-scale distributed computing architectures, etc.
2. Cyberspace Sensing and Data Representation
      Given the cross-media correlation, strongly time-sensitive evolution and multi-agent interaction that characterize big network data in cyberspace, the lab has explored distribution-free network data and its effective situation sensing and acquisition, conducted quality evaluation and sampling of multi-source, heterogeneous data, measured the correlations, differences and significance among data from multiple sources, and ultimately achieved a unified representation of multi-source, heterogeneous and featured data. The research fields include: the perception and measurement of data situations in cyberspace; quality evaluation, sampling and acquisition methods for network data; and the cleaning, refinement and fusion representation of multi-source, heterogeneous data.
3. Big Data Storage and Management
      To address the technical difficulties of big data storage and management in the integrated environment of the “man-cyber-physical society”, the lab has conducted research on novel data storage structures that offer high availability, good performance, ease of extension and low energy consumption, together with their key technologies. The specific research fields are: efficient integrated document storage structures, indexing mechanisms, weak consistency models and hierarchical storage service models; network data and network service identity and data migration mechanisms, efficient data-centric routing and intelligent transport mechanisms, as well as feature acquisition rules for multi-source dynamic data and representations of service demand; data localization strategies for data management, the decomposition of data computing tasks, and plan generation and scheduling optimization algorithms; overall design methods for physical resource management, the data access engine, system operations and maintenance, and other functional components of the network data storage system; distributed storage architectures as well as evaluations and benchmark data, etc.
4. Network Data Mining and Socialized Computing
      For the mining of large-scale network data and the study of the laws and development trends of social networks, our specific research fields include: analysis and mining methods for user behavior data over large-scale heterogeneous data, and measurement and fusion analysis methods for heterogeneous relation data; efficient and effective ranking models and other machine learning models and methods for big data analysis; information retrieval and recommendation methods for user-generated data and user behavior data; stable statistical laws in social networks, and the analysis and prediction of network evolution; the internal modeling of information interaction and diffusion in social media, as well as the analysis and prediction of macroscopic situations for burst phenomena and topic evolution in information diffusion, etc.
5. Technologies Related to the Network Data Management Engine
      Targeting the value of network data, the lab has developed a network data management engine and, on this basis, established a research platform for network big data. In terms of data accumulation, the lab incrementally acquires tens of millions of new data items every day and has accumulated and manages billions of network data items. In terms of knowledge accumulation, the lab has established a field-related socialized tag attribute set and entity ontology, guided by the requirements of industrial applications. This work has supported many research efforts on network big data and high-value deep web information services. The specific research fields include: data storage and computing architectures for exponentially growing data; the sensing and acquisition of deep web data; the deep mining of heterogeneous correlations and hidden clues, etc.
6. Big Data and Information Security
      Facing research problems such as network vulnerability and threat analysis, network security management and control, and network attack and defense, the big data and information security direction focuses on security control technologies based on information flow, network vulnerability analysis and evaluation, and network confrontation technologies. At the same time, it also strives to advance cloud security, the Internet of Things and other emerging applications. The specific research fields include: key technologies for multi-granularity information flow control centered on data security; security label specifications that satisfy control demands at different granularities; descriptive models and methods for label-based security policy normalization; formal analysis and authentication methods; fine-grained data tracking technologies; the application of information flow control technologies to cloud security and IoT security; network vulnerability and threat correlation and prediction technologies; situation analysis and prediction technologies and systems for network security; anomaly detection and feature extraction technologies and systems for unknown network flows; network confrontation technology; and technologies for the acquisition, analysis and containment of malicious code, etc.