The Arc Learning Hub LHN-TR+11 Basement 2
Nanyang Technological University
63 Nanyang Drive, Singapore 636922
Coffee Breaks & Lunches: outside LT1A
Conference Dinner:02.10- Flower Field Hall- Bay South Garden,
Day 1, Wednesday, 2nd October
9:15-10:30 Session 1
Janusz A. Holyst: Opening
Eduardo G. Altmann: On the predictability of social-media items
Janusz A. Holyst: Detection of spread source in complex networks
10:30-11:00 Coffee break
11:00-13:00 Session 2 (4 talks)
Alexander V. Boukhanovsky: Digital Personalities Ecosystem for Professional Decision Support
Piotr Gorski: Escaping polarization
Mateusz Wilinski: Scalable learning of Independent Cascade model from partial observations
Przemyslaw Kazienko: Boosting information diffusion by divided seeding
14:00-16:00 Session 3 (4 talks)
Maria Sigova: Financial functions of social networks: problems and prospects
Klaudia Bochenina: Operational forecasting of reactions in online social networks based on user digital footprints
Tomaz Savodnik: Building blocks for emerging applications of information dynamics in social networks
Boleslaw Szymanski: Finding Efficient Spreaders for Information Diffusion in Social Networks
18:30 Conference Dinner
(Venue: Flower Field Hall- Bay South Garden, Gardens-by-the-Bay)
Day 2, Thursday, 3rd October
9:30 - 10:30 Session 1 (2 talks)
Joanna Toruniewska: Heider interactions in community of students
Julian Sienkiewicz: Categorical and Geographical Separation in Science
10:30 - 11:00 Coffee Break
11:00 - 13:00 Session 2 (4 talks)
Marko Grobelnik: Cross-lingual Global Media Monitoring
Krzysztof Suchecki: Changes of distances between users in Twitter network dynamics
Radoslaw Michalski: The application of the genetic algorithm to social influence maximization
Stanislaw Saganowsk: Group Evolution Prediction in complex networks
13:00 - 14:00 Lunch
14:00 - 17:00 RENOIR Internal Meeting
14:00- 14:15 Coordinator's presentation
14.15- 14:35 WP1: Knowledge and innovation exchange on data infrastructure for social information (STA, Sebastijan)
14.35- 14:55 WP2: Knowledge and innovation exchange on data-mining/machine- learning for reverse engineering of social information processing (JSI, Marko)
14:55 - 15:15 WP3: Knowledge exchange on modelling of information inference in social networks (WUT, Krzysztof)
15:15- 15:35 WP4: Training, sharing and dissemination of knowledge (PWR, Przemek)
15:35- 15:55 WP5: Management and Coordination (WUT, Anna)
15:55- 16:55 General Assembly Meeting*: all consortium members, Project Management Committee, Advisory Board
16:55- 17:00 Wrap- up, conclusions
* Proposed agenda of General Assembly Meeting:
1. Secondment’s Implementation,
2. Project book to be published in Springer,
3. New project,
4. Other issues.
(GA discusses strategic questions of the science of RENOIR and eventual problems related to the project management. AB assesses the scientific progress of the project and advises the PMC on possible improvements)
Detection of spread source in complex networks
Robert Paluch, Lukasz Gajewski, Xiaoyan Lu, Krzysztof Suchecki, Boleslaw K. Szymanski, Janusz A. Holyst
Spread over complex networks is a ubiquitous process with increasingly wide applications. Locating spread sources is often important, e.g. finding the patient one in epidemics, or source of rumor spreading in social network. Here we explore the problem of complexity of currently known methods as well as we investigate the validity of the assumption that information spreads only via the shortest paths.
Pinto, Thiran and Vetterli introduced an algorithm (PTVA) to solve the problem of source detection in which a limited set of nodes act as observers and report times at which the spread reached them. PTVA uses all observers to find a solution and assumes the information travels via a single, shortest path, which by assumption is the fastest way. Here we propose a new approach in which observers with low quality information (i.e. with large spread encounter times) are ignored and potential sources are selected based on the likelihood gradient from high quality observers. The original complexity of PTVA is O(Nα), where α ∈ (3, 4) depends on the network topology and number of observers (N denotes the number of nodes in the network). Our Gradient Maximum Likelihood Algorithm (GMLA) reduces this complexity to O(N2 log(N)) without reduction of the detection accuracy.
We also show that assumption that information spreads only via the shortest paths leads to the overestimation of propagation time for synthetic and real networks, where multiple shortest paths as well as longer paths between vertices exist. We propose a new method of source estimation based on maximum likelihood principle, that takes into account existence multiple shortest paths. It shows up to 1.6 times higher accuracy in synthetic and real networks.
Finding Efficient Spreaders for Information Diffusion in Social Networks
, Panagiotis Karampourniotis, Gyorgy Korniss
Recent global events and their poor predictability are often attributed to the complexity of the world event dynamics. Here, we use a simple classic threshold model of contagion spreading in complex social systems. Information propagates there with certain probability from nodes just activated to their non-activated neighbors . Diffusion is triggered by activation of even a small set of nodes with new idea. In our model, we allow individuals' susceptibility to this idea to be heterogeneous to investigate the threshold-limited cascades sizes as the distribution of thresholds varies from one with all agents having identical thresholds, to the other with each agent having a threshold drawn from the uniform distribution. We show that individuals' heterogeneity of susceptibility governs the dynamics, resulting in different sizes of initiators needed for consensus on a single opinion, so the case when the cascade created by the initiator includes all the nodes. We introduce two new selection strategies for influence maximization. One balances selection of nodes with high resistance to adoptions and nodes positioned in central spots in the network. The second strategy finds and selects nodes that increase the group's influence for many random choices of the members of the group.
Fig. 1. Performance of weights for the first strategy for different threshold distributions.
Our strategies outperform other existing strategies for many cases of the susceptibility diversity and network degree assortativity. The drawback is that both methods require a choice of control parameters according to the network properties. Our novel approach to select these parameters uses ML and features with low cost of computing. Random walks over the target network create its small models for training selection of the parameters This approach is illustrated by a realistic example.
Digital Personalities Ecosystem for Professional Decision Support
On the predictability of social-media items
Alexander V. Boukhanovsky, Claudia O. Bochenina, Sergey V. Kovalchuk
The concept of 'digital personality' (DP) is the extension of the 'digital twin' for the human or the society with the certain limitations. DP contains the mutually agreed set of the data (digital traces) and the models to formalize and predict relations between them. The digital traces are depending on the application area and may be aggregated from the various sources (social media, cell data, payments data etc.). The models mainly trained over the data by the machine learning techniques: they make simulation of the emotional, behavioral and decision making features on the external excitations (e.g. information spreading). We will consider the following kinds of digital personalities will be considered:
- Homo Digitalis Domus: the basic DP which describes the common features of the human, using their everyday digital trace;
- Homo Digitalis Virtual: the (partly) synthetic DP trained over the set of real humans' DP for the training and game systems;
- Homo Digitalis Pro: the interface of decision support system which allow to simulate the creative professional actions during the analysis of situation and the decision making;
- Homo Digitalis Civitas: society of the connected DPs, includes as the real as virtual agents for the describing of collective effects;
- Homo Digitalis Habitants: personally-tuned DP for the digital assistants.
The examples of these DPs' kinds applications in decision support systems in a healthcare, finances, smart city government and industrial personnel services will be presented.
Eduardo G. Altmann
The spreading of social-media items through a network is affected by a variety of different factors and shows non-trivial time dependencies. Complex-systems research tend to focus on universal processes (e.g., rich-get-richer mechanisms) that are able to describe the typical evolution, averaged over many items. In this talk I will argue for the study of the fluctuations around these average behaviours. I'll first show how socia-media time series (of views of YouTube videos and of tweets on specific topics) show extremely large fluctuations, much larger than expected by simplified models. These large fluctuations imply a limited possibility to forecast the temporal evolution of individual social-media items, even if the models for the average behaviour are accurate. I'll then discuss stochastic models that describe the fluctuations observed in such systems, shedding some light on the mechanisms responsible for their origins. For the case of views of YouTube videos, I'll show how a descriptions based on stochastic differential equations require the consideration of noise variables with non-Gaussian distribution. For the case of Twitter, I'll argue that the hubs in the (scale-free) network of followers play an important role but are not sufficient to account for the appearance of the large spikes of interest in short periods of time.
Building blocks for emerging applications of information dynamics in social networks
Classification and alignment of researched topics to emerging real-world problems
Understanding or even predicting information dynamics in social networks, such as speed and paths of information spreading, information reach and its influence on both recipient node and future network structure as well as information transformation along the paths has been of great interest to scientists since social networks emerged.
With goal to discover and reverse-engineer the mechanisms of such information spreading project RENOIR started in 2016. Since then couple of new related “trending topics” emerged or regained great public and business interest. Those topics include but are not limited to: fake news, content copyright, personal data protection, influencer marketing, (social) media brand/reputation monitoring and management.
By classification and aligning of researched topics to emerging real-world problems a catalog of proposed building blocks and processing pipelines could be compiled as high level application blueprint to address such problems.
It turns out that some emerging problems could be addressed by applying already researched methods, so “time to market” could be reasonably short. Defining viable business case for some applications - as opposed to researched method use case - is however challenging for some classes of emerging problems.
Financial functions of social networks: problems and prospects
Social networks are a big mass of data. In finance, data has always performed various functions. They signaled problems, served as a quantitative definition of a client’s creditworthiness, measured activity volumes, acted as activity indicators, etc. But with the transition to big data, they became a special function of financial escalation. In general, with their arrival, there was a qualitative leap in the interweaving of data flows with finances. Traditional information includes all information on customer accounts, internal and external reporting, data on cooperation with correspondent banks, creditors and debtors, clearing organizations, as well as stock quotes, exchange rates, etc. Along with this, banks are included in the process of collecting new information, which is beginning to go beyond the traditional, and include data, as well as from social networks. This information is extremely valuable, but it is extremely voluminous in volume, which requires new approaches to its collection, processing, storage, and use. Information coming from social networks often changes the behavior of banks and their relationships with customers. Often new information causes the problem of “information pollution” of banks and the need to work to “clean” them and bring the data into the proper form.
We have put forward and substantiate the hypothesis of interpenetration, interaction, and competition of financial and social networks. This study aims to substantiate this hypothesis. The objective basis for the formation of financial functions in social networks is, on the one hand, the digitization of the financial industry and its transition to big data, on the other hand, the existence of social networks in digital form and in big data. Such technological and informational convergence of the two spheres has prepared the ground for their interpenetration. However, between them, it is planned not only cooperation and interaction but also significant contradictions and competition over users of settlement and payment services and microcredits.
The interaction between traditional finance and the financial functions of social networks can be shown in terms of game theory. In this case, an analysis of the selected strategies is assumed. In our opinion, this analysis can be carried out from a position in some cases of cooperative games. Then it turns out the benefits of cooperation and can proceed to the analysis of the mechanism of cooperation. In other cases, you can use the formula of non-cooperative games, which allows you to find out the advantages of a particular strategy of behavior.
The general conclusion of the study: social networks are becoming an important source of information for the financial business and are included in its marketing strategies. However, their goal for finance is much broader; they translate financial connections into online relationships with consumers and also connect to the introduction of financial business.
Changes of distances between users in Twitter network dynamics
Boosting information diffusion by divided seeding
Twitter allows users to follow other users, allowing to track all their Tweets. These follows forma a network responsible for most of Twitter's ability to spread and transmit information. This network is not static, as users can change who they are following. This happens mostly in response to Tweets or Retweets and depends whether it is interesting for them or not.
Based on the data from Twitter, probability to Retweet, Unfollow and Follow given the interest similarity between users has been determined. A model using these has been formulated, simulated and analyzed anlytically. One of the most prominent results is that users who are similar become topologically closer in the network, while those dissimilar drift further away. This happens in real Twitter, simulations on real Twitter network as well as on purely artificial random networks. This behavior has been reproduced in the model as well as described quasi-analytically for random graphs. Predictions are in agreement with simulations and show only a small parameter range for stable behavior.
Cross-lingual Global Media Monitoring
Global media monitoring assumes dealing with large amounts of heterogeneous textual data across many languages in a near real-time. We will present a pipeline of machine learning, text mining, and semantic extraction components representing global social dynamics as an evolving network of interrelated events extracted from media. We will demonstrate the proposed approach on an operational system "Event Registry" (http://EventRegistry.org
) for (a) collecting media information from over 300,000 news and social media sources, (b) performing linguistic and semantic processing in multiple languages, (c) forming cross-lingual events and event sequences, (d) streaming information about events in open data formats, (e) rich visualizations, and (f) complex queries to analyse global social dynamics. Challenges and technical solutions will be discussed.
Information spreading in complex networks is often modeled as diffusing information with certain probability from nodes that possess it to their neighbors that do not. Information cascades are triggered when the activation of a set of initial nodes – seeds – results in diffusion to large number of nodes. Several novel approaches for seed initiation that replace the commonly used activation of all seeds at once with a sequence of initiation stages are introduced. Divisive, sequential strategies at later stages avoid seeding highly ranked nodes that are already activated by diffusion active between stages. The gain arises when a saved seed is allocated to a node difficult to reach via diffusion. Sequential seeding and a single stage approach are compared using various seed ranking methods and diffusion parameters on real complex networks. The experimental results indicate that, regardless of the seed ranking method used, sequential seeding strategies deliver better coverage than single stage seeding. Longer seeding sequences tend to activate more nodes but they also extend the duration of diffusion. Various variants of sequential seeding resolve the trade-off between the coverage and speed of diffusion differently. A coordinated execution of randomized choices enabled us to precisely compare different algorithms. Surprisingly, applying sequential seeding to a simple degree-based selection leads to higher coverage than achieved by the computationally expensive greedy approach currently considered to be the best heuristic.